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ABSTRACT 

Real-time  interactive  video  applications,  such  as  video  teleconferencing,  present 
difficult  challenges  to  network  designers  due  to  strict  quality  of  service  constraints  and 
the  limitations  of  traditional  video  compression  schemes.  These  limitations  reveal 
themselves  notably  in  two  areas:  poor  error  robustness  and  a  lack  of  flexibility  when 
dealing  with  multicast  scenarios  over  heterogeneous  networks. 

A  more  promising  approach  that  improves  error  robustness  while  also  offering  a 
solution  to  the  network  heterogeneity  problem  is  to  employ  a  layered  video  codec.  This 
thesis  presents  the  implementation  of  a  new  layered  video  codec  scheme.  Block  updating 
coupled  with  an  aging  algorithm  is  used  in  this  scheme  to  select  macroblocks  for 
transmission.  Block  updating  selects  macroblocks  that  have  changed  due  to  scene 
motion,  and  the  aging  algorithm  ensures  that  an  entire  frame  is  transmitted  within  a  set 
time  interval.  Layering  is  accomplished  through  application  of  the  fast  Haar  transform 
and/or  the  discrete  cosine  transform.  Layer  assignments  are  made  by  grouping  bands  of 
coefficients  with  similar  variances.  Quantization  and  encoding  for  motion  video  employs 
both  an  industry  standard  and  uniform  quantization  with  a  custom  variable  length  coding 
table.  For  static  slides,  uniform  quantization  and  a  second  custom  variable  length  coding 
table  are  employed.  Rate  control  is  accomplished  via  the  reduction  of  a  four-dimensional 
operational  distortion  surface  to  a  one-dimensional  optimal  curve  implemented  as  a 
simple  table  lookup  of  quantizers. 
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I.  INTRODUCTION 

A.  BACKGROUND 

Video  teleconferencing  (VTC)  is  expected  to  contribute  significantly  to  the  US 
Navy's  Information  Technology  for  the  21st  Century  (IT-21)  initiative.  IT-21  seeks  to 
change  the  paradigm  to  warfighting  from  a  platform-centric  approach  to  a  network- 
centric  approach,  where  information  superiority  is  leveraged  with  smart  weapons  in  order 
to  achieve  the  desired  result  more  effectively.  The  goal  of  IT-2 1  is  to  link  all  US  Forces 
together  in  a  network  that  enables  transmission  of  voice,  video,  and  data  from  individual 
workstations  seamlessly  to  both  local  and  remote  users  [1][2].  Employing  VTC  over  a 
tactical  network  at  the  battle  group  level  caters  to  several  useful  applications,  such  as 
collaborative  planning,  distance  learning,  remote  maintenance  and  telemedicine. 

1.  Multimedia  Communication  and  Tactical  Video  Teleconference 

(VTC) 

In  general,  multimedia  communications  are  either  unicast  or  multicast.  Unicast 
represents  peer-to-peer  communications  while  multicast  represents  m  to  n 
communications  where  m  ranges  from  1  to  n.  Unicast  includes  any  client-server 
applications,  such  as  video  on  demand  (VOD)  or  IP  telephony.  Multicast  examples 
include  distance  learning  and  remote  conferencing.  The  tactical  VTC  scenario 
considered  here  is  inherently  a  multicast  application  running  over  a  heterogeneous, 
wireless  network.  Here,  "heterogeneous"  implies  that  a  connection  traverses  a  series  of 
links  (each  imposing  potentially  different  bandwidths),  and  the  recipient  workstations 
differ  in  capability  and  capacity;  "wireless"  implies  an  internetwork  of  wireline  local  area 
networks  (LANs)  connected  by  at  least  one  wireless  channel.  As  such,  each  transmitter 
transmits  to  multiple  receivers  in  the  multicast  group.  The  multicast  group  consists  of 
some  combination  of  active  participants  that  are  allowed  to  transmit  and  passive 
participants  that  only  receive.  For  example,  the  problem  of  transmitting  multicast, 


multimedia  traffic  over  ATM  networks  was  examined  in  [3].  This  situation  is  illustrated 
in  Figure  1-1. 


Video 


Workstation  Workstation 

Figure  1-1:  Simple  VTC  Multicast  with  Two  Active  and  Two  Passive  Nodes. 

While  the  network  described  here  affords  large  bandwidth  within  each  shipboard 
LAN,  VTC  realizes  its  true  value  only  if  implemented  within  the  entire  battlegroup.  It  is 
because  the  wireless  channel  constrains  both  the  bit  rate  and  the  bit  error  rate  (each  of 
which,  in  turn,  impacts  the  robustness  of  the  VTC  application)  that  the  tactical  VTC 
network  presents  challenges  not  encountered  with  wireline  networks  and  provides  a 
hostile  environment  for  traditional  video  codecs. 

2.         Video  Compression  and  Robustness 

Since  video  and  audio,  to  a  lessor  degree,  are  bandwidth  intensive,  the  signals 
must  undergo  video  coding  prior  to  transmission,  trading  a  reduction  in  bandwidth  for  a 
reduction  in  quality.  Because  the  human  visual  system  (HVS)  places  greater  relative 
importance  on  lower  frequencies  than  higher  frequencies,  typical  motion  video  is 
perceived  as  primarily  lowpass  [4].  Therefore,  lossless  two-dimensional  (2-D)  transform 
methods  are  used  to  create  a  frequency  domain  representation  of  an  image.  The  HVS 
perceptive  properties  can  then  be  exploited  by  quantizing  the  resulting  coefficients  to 
varying  degrees  of  precision  with  higher  precision  allotted  to  the  lower  frequency 


coefficients.  Quantization  reduces  the  dynamic  range  of  the  coefficients  and  loses 
information,  but  it  enables  fewer  bits  to  represent  the  coefficients.  Usually,  many  of  the 
higher  frequency  coefficients  are  zeroed  out  via  this  quantizing  process;  runs  of  zeroes 
are  created.  Because  zeros  need  not  be  explicitly  represented,  run-length  encoding  (RLE) 
is  utilized  to  generate  a  more  compact  representation  of  the  quantized  coefficients,  which 
is  subsequently  replaced  by  a  more  efficient  lossless  variable  length  code  (VLC).  These 
techniques  are  collectively  called  spatial  compression.  Spatial  compression  is  the  basis 
of  image  compression  standards  like  that  of  the  Joint  Photographic  Experts  Group 
(JPEG). 

A  video  codec  may  simply  treat  each  frame  of  a  video  as  a  separate  still  image 
and  subject  it  to  spatial  compression  independently  from  the  other  frames.  This  approach 
is  known  as  intraframe  coding.  An  example  is  Motion-JPEG  (M-JPEG).  Intraframe 
coding  has  the  advantage  of  superior  error  resilience  since  decode  errors  are  confined 
strictly  to  the  current  frame.  Its  disadvantage  is  that  compression  gain  with  acceptable 
image  quality  is  limited  to  approximately  0.5  bits  per  pixel  (bpp)  [4]. 

For  a  given  quality,  higher  compression  gains  are  realized  if  the  video  codec 
exploits  the  high  degree  of  correlation  that  video  frames  tend  to  exhibit  on  a  frame-to- 
frame  basis.  This  is  called  interframe  coding,  and  it  eliminates  redundancy  by  coding 
only  the  differences  in  successive  frames.  The  compression  gains  achieved  by  interframe 
coding  vary  in  relation  to  the  degree  and  type  of  motion  that  occurred  between  successive 
frames.  Nearly  static-content  frames  exhibit  high  correlation  and  result  in  high 
compression  gain  while  highly  dynamic-content  frames  have  little  correlation  and  result 
in  less  compression  gain.  If  two  successive  frames  have  no  correlation,  perhaps  due  to  a 
scene  change,  interframe  coding  performs  no  better  than  intraframe  coding  and  typically 
performs  worse  due  to  the  overhead  required  to  track  motion.  The  disadvantage  to 
interframe  coding  results  from  the  dependency  between  frames  at  the  decoder.  If  errors 
occur  in  the  current  frame,  the  errors  tend  to  propagate  among  frames  as  well  as  spatially 
within  frames  [5]  [6].  Consequently,  video  codecs  like  that  of  the  Moving  Pictures 


Experts  Group  (MPEG)  incorporate  both  types  of  compression  techniques  in  an  attempt 
to  increase  compression  gain  and  bound  error  propagation. 

3.         Traditional  Video  Codecs 

Low  bit  rate  video  coding  standards  such  as  H.261  and  H.263  perform  best  in 
homogeneous,  unicast  environments.  The  video  server  negotiates  a  desired  Quality  of 
Service  (QoS)  consistent  with  the  desired  video  quality  and  available  bandwidth  prior  to 
delivery.  Since  network  conditions  are  rarely  static,  the  received  video  quality  typically 
changes  due  to  dropped  or  incomplete  frames  caused  by  losses  within  the  network.  With 
the  implementation  of  a  closed  loop  control  scheme  via  feedback  reports  from  the 
recipient,  the  server  can  react  to  the  changing  network  conditions  and  adjust  the 
quantization,  frame  size,  or  frame  rate  in  order  to  vary  the  bit  rate. 

However,  when  traditional  video  codecs  are  applied  in  a  wireless,  multicast, 
heterogeneous  environment,  shortcomings  are  revealed.  First,  the  traditional  scheme 
relies  on  guaranteed  bandwidth  for  delivery,  trading  bandwidth  for  quality.  Selecting  an 
appropriate  quality  (and  therefore  required  bandwidth)  poses  a  dilemma  in  a 
heterogeneous  network.  Since  each  user  is  reached  by  a  different  path  on  the  network, 
each  experiences  different  levels  of  congestion.  Even  with  feedback,  the  controllable 
application  is  faced  with  a  quandary  in  determining  how  to  make  adjustments.  Sending 
high  quality,  high  bandwidth  video  supports  some  users  but  leaves  low  bandwidth 
recipients  with  degraded  video  due  to  high  packet  loss.  If  the  lowest  common 
denominator  is  supported  instead,  all  recipients  are  forced  to  view  lower  quality  video, 
and  the  high  bandwidth  links  are  underutilized.  Clearly,  meeting  the  varied  expectations 
with  a  single  video  stream  is  impractical  and  transmitting  multiple  video  streams  with 
gradations  in  quality  demands  a  much  greater  bandwidth  expense. 

Second,  the  poor  error  robustness  demonstrated  by  traditional  low  bit  rate  video 
coders  is  especially  troublesome  since  retransmission  is  not  practical  in  a  real-time 
application.  Finally,  feedback  itself  is  undesirable  in  a  low  bit  rate  network.  Feedback 
employed  with  the  goal  of  mitigating  congestion  actually  consumes  available  bandwidth, 
causes  an  additional  load  on  constrained  nodes,  and  increases  congestion  further. 


4.  Receiver-Based  Layered  Multicast  (RLM) 

A  promising  approach  that  offers  a  solution  to  the  network  heterogeneity  problem 
while  offering  some  improvement  in  error  robustness  is  the  receiver-based  layered 
multicast  (RLM)  scheme  proposed  by  McCanne  et  al.  [7].  RLM  employs  a  layered  video 
codec  that  transmits  video  in  scalable  layers  that  progressively  refine  quality.  An 
independently  decodable  base  layer  is  generated  that  guarantees  a  minimum  acceptable 
quality.  Separate  enhancement  layers  increase  quality  in  a  hierarchical  manner.  With 
RLM,  each  recipient  can  decode  just  the  base  layer  for  low,  but  acceptable,  quality  or  add 
one  or  more  enhancement  layers  to  improve  quality  as  bandwidth  and  hardware  permit. 
This  idea  is  illustrated  in  Figure  1-2. 
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Figure  1-2:  Overview  of  Layered  Coder/Decoder. 

RLM  represents  a  starting  point  for  designing  an  integrated  approach  to 
improving  robustness.  The  layered  structure  slightly  reduces  the  effects  of  congestion 
because  a  particular  node  only  needs  to  carry  subscribed  layers.  Unsubcribed  layers  can 
be  dropped.  Figure  1-3  illustrates  this  concept.  Furthermore,  earlier  work  by  Rhee  and 
Gibson  indicates  that  layered  video  exhibits  improved  resilience  to  bit  errors  introduced 
during  transmission  because  spreading  bit  errors  across  multiple  layers  has  less  negative 
impact  on  the  reconstructed  video  [8]. 
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Figure  1-3:  Adapting  to  Network  Heterogeneity  Using  RLM. 

B.  THESIS  OBJECTIVE 

The  main  objective  of  this  thesis  is  to  implement  in  Matlab1  the  new  layered  video 
coding  scheme  for  a  tactical  VTC  proposed  by  Parker  [9]  [10]  for  use  in  multicast, 
hetergeneous,  wireless,  asynchronous  transfer  mode  (ATM)  networks.  The  tactical  VTC 
session  is  assumed  to  consist  of  low-motion  video,  such  as  a  "talking  head"  in  the  style  of 
a  typical  newscast  and  static  displays  in  the  style  of  an  overhead  presentation.  No 
assumption  is  made  about  the  relative  proportions  of  the  two  types  of  content;  the  session 
could  be  all  low-motion  video,  a  sequence  of  static  displays,  or  any  combination  of  the 
two  types  in  any  order. 

The  complete  specifications  of  the  coder  are  detailed  in  Table  1-1  [9].  For  the 
implementation  in  this  thesis,  a  color  depth  of  8-bit  grayscale  is  evaluated,  but  the 
technique  can  be  expanded  to  include  4:2:0  sub-sampled  24-bit  color.  Audio 
compression  to  8  kbps  or  less  may  be  accomplished  via  code  excited  linear  prediction 
(CELP)  speech  coding,  but  it  is  not  addressed  further  in  this  work.  Error  robustness  is 
provided  via  a  block  updating  scheme  that  limits  the  impact  of  decoder  errors.  Layering 
is  accomplished  by  judiciously  grouping  the  frequency  domain  content  obtained  from  the 
fast  Haar  transform  (FHT)  and/or  the  discrete  cosine  transform  (DCT);  the  exact  method 
of  frequency  decomposition  depends  on  the  video  content.  Handling  both  motion  video 
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and  static  slides  with  a  single  coder  requires  significant  flexibility  and  compromise  since 
the  frequency  characteristics  of  each  are  different.  Therefore,  the  coder  is  optimized  to 
handle  each  type  of  content  separately  -  separate  layering  and  separate,  custom  VLC 
tables.  Finally,  the  bit  rate  control  issue  is  examined,  and  an  approach  that  reduces  an  n- 
dimensional  rate  control  problem  to  a  simple  table  lookup  is  implemented. 


VTC  Stream Parameter Value 

Video  Bandwidth  64-96  kbps 

Resolution                                1 76x  1 44  (QCIF) 
Frame  Rate                               10  fps 
Color  Depth                             8-bit  gray/4:2:0  24-bit  color 
Audio Bandwidth <  8  kbps 

Table  1-1:  Tactical  VTC  Multimedia  Requirements. 


C.  THESIS  ORGANIZATION 

Chapter  II  considers  techniques  for  coding  video.  A  general  discussion  of  video 
stream  structural  hierarchy,  transform  coding  methods,  quantization  techniques,  entropy 
encoding,  frame  coding,  and  quality  measurement  is  presented.  Chapter  EQ  presents  the 
specific  techniques  utilized  in  coding  both  motion  video  and  static  slides.  Chapter  IV 
presents  results.  Conclusions  and  recommendations  for  future  study  are  given  in  Chapter 
V.  Appendix  A  and  Appendix  B  provide  the  custom  VLC  tables  used  with  motion  video 
and  static  slides,  respectively.  Appendix  C  is  a  Matlab  code  library  of  the  layered  video 
coder  implementation. 


II.         LAYERED  VIDEO  CODER  DESIGN  CONSIDERATIONS 

The  basic  components  of  a  video  coder  are  shown  in  Figure  II- 1 .  This  chapter 
begins  with  an  explanation  of  video  stream  hierarchy  and  then  discusses  each  of  the 
component  parts  displayed  in  Figure  II- 1 .  Motion  compensation  is  present  only  if  the 
coder  attempts  to  exploit  frame-to-frame  correlation.  Otherwise  each  frame  is  processed 
independently.  The  chapter  concludes  by  addressing  a  method  for  quantifying  the 
amount  of  distortion  introduced  by  the  processing. 


Input  Video 


*f 


Transform 


Quantize 


Entropy 
Encode 


Compressed 

Video 

Stream 


Motion 
Compensation 


Figure  II- 1:  Basic  Components  of  a  Video  Coder. 


A.  VIDEO  STREAM  STRUCTURAL  HIERARCHY 


A  video  stream  is  organized  into  a  hierarchy  of  logical  components.  Although  the 
specific  organizational  scheme  varies  with  the  particular  coder  under  consideration,  some 
of  the  most  common  components  are  the  following. 

A  frame  or  picture  is  the  basic  display  unit.  It  is  a  sampled  version  of  the  original 
scene  taken  at  a  particular  instant  in  time.  Each  frame  is  composed  of  a  rectangular  array 
of  pixels  or  points  within  the  frame.  Each  pixel  contains  a  data  structure  indicating  its 
luminosity  or  color.  The  dimensions  of  the  array  define  the  picture  resolution,  given  as 
columns  x  rows,  such  as  the  ITU-T  defined  QCIF  resolution  of  176x144.  Pixels  within  a 
frame  are  organized,  in  order  of  increasing  size,  into  blocks,  macroblocks,  and  groups  of 
macroblocks  (GOBs).  A  block  is  an  8x8  array  of  pixels;  a  macroblock  is  a  16x16  array 
of  pixels  or,  equivalently,  four  blocks.  A  frame  may  be  considered  as  rows  of 


macroblocks.  One  or  more  contiguous  rows  of  macroblocks  are  termed  a  GOB.  This 
hierarchy  is  illustrated  in  Figure  II-2. 
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Figure  II-2:  Organizational  Hierarchy  for  Compressed  Video. 

B.  TRANSFORM  CODING 

Although  compression  through  direct  scalar  quantization  of  pixel  values  is 
possible,  it  is  inefficient.  An  alternative  approach  is  to  use  transform  coding.  Because 
contiguous  pixels  within  a  frame  tend  to  be  highly  correlated,  the  application  of  a  suitable 
linear  transform2  to  decorrelate  the  pixels  yields  two  primary  advantages.  The  first  is  a 
property  termed  "energy  packing"  efficiency.  The  second  is  that  the  resulting 
coefficients  are  more  conducive  to  perceptual -based  quantization  schemes. 


The  transform  should  be  lossless  and  invertable.    Lossless  means  that  no  information  is  lost  through 
application  of  the  transform.  Invertable  means  that  the  original  information  is  recoverable. 
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A  signal  is  decorrelated  if  the  application  of  a  transform  causes  the  signal's 
autocorrelation  matrix  to  become  diagonal;  that  is,  it  uncorrelates  the  resulting 
coefficients.  The  optimal  transform  tightly  packs  all  the  energy  (information)  into  the 
minimum  number  of  coefficients  possible,  resulting  in  the  highest  energy  packing 
efficiency.  Arranging  these  TV  coefficients  in  decreasing  order  of  magnitude  and  retaining 
only  the  first  k  coefficients  gives  the  least  distortion  as  measured  by  mean  squared  error 
(MSE)  compared  to  any  other  set  of  k  coefficients.  Similarly,  a  given  level  of 
quantization  of  the  decorrelated  coefficients  results  in  the  least  distortion  of  the  original 
data  [11].  Because  certain  transform  coefficients  may  hold  greater  perceptual  relevance 
by  the  HVS,  this  dependency  can  be  exploited  by  utilizing  a  frequency-based  transform 
and  then  quantizing-  a  lossy  process  -  with  a  step  size  proportional  to  the  perceptual 
importance  of  each  coefficient. 

1.         Discrete  Cosine  Transform  (DCT) 

Theoretically,  the  discrete-time  Karhunen-Loeve  transform  (KLT)  provides  the 
greatest  energy  packing  efficiency  [11].  However,  two  liabilities  preclude  the  use  of  the 
KLT  in  video  compression.  First,  the  KLT  is  extremely  computationally  intensive  - 
requiring  order  N2  operations.  Second,  the  KLT  is  signal  dependent  -  requiring  a 
separate  eigenvector  calculation  for  each  transformed  data  block.  Instead,  transforms  that 
approach  the  energy  packing  efficiency  of  the  KLT  and  possess  more  efficient  algorithms 
are  utilized  in  video  coders. 

A  widely  used  transform  for  image  processing  is  the  two-dimensional  (2-D) 
discrete  cosine  transform  (DCT).  The  2-D  DCT  provides  energy  packing  performance 
very  close  to  that  of  the  KLT  and  can  be  implemented  with  fast  algorithms  that  reduce  the 
computational  effort  to  the  order  of  Nlo&N  [4].  A  frame  is  transformed  by  partitioning 
the  frame  into  NxN  regions  of  pixels  and  applying  the  2-D  DCT  to  each  individual 
region.  N  can  be  any  integer  provided  that  integer  multiples  of  N  equals  the  overall 
dimensions  of  the  frame.  Because  the  correlation  among  contiguous  pixels  tends  to 
decrease  with  increasing  size  of  the  NxN  region  which,  in  rum,  decreases  the  resulting 
compression  gain,  the  typical  NxN  size  used  with  the  2-D  DCT  is  8x8  (a  block). 
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Denoting  the  original  block  asf(i,j)  and  the  coefficient  block  by  F(u,v),  the  2-D  DCT  is 
given  by  [4] 


N-\N-\ 


F(u,v)  =  —  C(u)C(v)  >   >  f(i, j)cos\ cos , 

N  hh  \     2N     J      V      2N      J 


where 


j  =  0;  =  0 


C(x)    = 


(n-i) 


-=,    x  =  0 

V2 


(n-2) 


1,       otherwise. 
The  inverse  DCT  is  given  by 

/(^•)4SIWC(v)cosf<2i^)cosf^±^V(«,v).         (H-3) 
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u=0  v=0 


y      2N      ;      {      2N      , 


The  result  is  a  block  of  64  coefficients  having  a  spatial  frequency  interpretation  as 
shown  in  Figure  H-3.  The  F(0,0)  coefficient  represents  the  DC  contribution  and  the 
remaining  63  coefficients  represent  the  AC  contribution.  The  different  elements  of  an 
image  map  into  the  frequency  domain  as  shown  in  Figure  II-4  [4].  The  application  of  the 
2-D  DCT  to  an  8x8  block,  where  there  is  typically  little  variation  from  pixel  to  pixel, 
leads  to  a  predominance  of  lowpass  content  in  the  frequency  domain.  Given  this 
condition,  the  magnitudes  of  the  AC  coefficients  are  largest  in  the  region  around  the  DC 
coefficient  and  diminish  with  increasing  spatial  frequency. 
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Figure  II-3:  Frequency  Interpretation  of  DCT  Coefficients. 


Increasing  horizontal  frequency 


DC 

Vertical 
Edges 

Diagonal          / 
Edges      / 

Horizontal 
Edges 

\        /  High 
\    /     Frequencies 

Figure  II-4:  Structural  Decomposition  of  Image  Elements  [4]. 

2.         Discrete  Wavelet  Transform  (DWT) 

Wavelet  coding  is  another  technique  for  compressing  still  images  and  has  been 
shown  to  offer  slightly  better  image  quality  than  DCT-based  schemes  for  similar  levels  of 
compression  at  the  cost  of  greater  computational  complexity  [12].  Applying  a  wavelet 
transform  to  an  image  transforms  that  image  into  a  "multifrequency  component 
representation,"  where  each  component  has  its  own  frequency  characteristics  and  spatial 
features.3  A  discrete  wavelet  transform  (DWT)  filters  and  decimates  an  image  into 


The  wavelet  transform,  like  the  2-D  DCT,  is  lossless  and  invertable. 
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separate  subbands  of  coefficients  containing  a  mixture  of  the  high  frequency  and  low 
frequency  details.  Image  decomposition  is  accomplished  via  two  analysis  filters.  The 
first  extracts  the  low  frequency  content,  the  signal  average;  the  second  extracts  the  high 
frequency  content,  the  signal  details. 

The  simplest  example  of  a  DWT  is  the  fast  Haar  transform  (FHT)  [13].  It  is 
described  by  its  signal  average  equation, 


al(n)=-(x0(2n)  +  x0(2n  +  \))  (H-4) 


and  its  signal  detail  equation, 

d, in)  =  ^ ( x0 (In)  -x0(2n  + 1)) ,  (II-5) 

where  xo  is  the  original  data  vector,  and  vectors  a\  and  d\  are  the  first  order  analysis 
(decomposition)  vectors.  If  xo  contains  2L  elements,  applying  Equations  II-4  and  IL5 
generates  analysis  vectors  of  length  2L;. 

Referring  to  Figure  II-5,  the  LL  subband  is  calculated  by  applying  the  average 
equation  to  the  columns  and  rows  of  the  image.  The  LL  subband  retains  the  lowpass 
information  from  the  original  image  and  presents  a  coarse  representation  of  the  original 
image.  Because  typical  images  have  lowpass  characteristics,  most  of  the  energy  from  the 
image  is  represented  in  this  subband.  The  HL  subband  is  calculated  by  applying  the 
average  equation  to  the  columns  of  the  image  and  the  detail  equation  to  the  rows.  The 
HL  subband  retains  the  vertical  edge  details  from  the  original  image.  Typically,  less 
energy  from  the  original  image  is  represented  in  the  HL  subband  compared  to  that  of  the 
LL  subband.  The  LH  and  HH  subbands  are  found  analogously  with  LH  retaining 
horizontal  edge  details  and  HH  retaining  the  diagonal  edge  details.  The  HH  subband 
typically  has  the  lowest  energy  content.  In  fact,  in  some  applications  the  HH  subband  is 
discarded  in  its  entirety  [14]. 
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Figure  II-5:  Octave-Based  Wavelet  Decomposition. 

In  order  to  differentiate  the  frequency  content  of  an  image  further,  a  second  DWT 
can  be  applied.  Still  referring  to  Figure  II-5,  the  effect  of  a  second  order  octave-based 
decomposition  obtained  by  applying  the  same  analysis  filters  to  the  LL  subband  is 
illustrated.  This  increase  in  the  number  of  subbands  allows  tailoring  additional  stages  in 
the  coder  to  emphasize  the  perceptually  important  details  over  less  perceptual 
background  noise  as  will  be  discussed  in  greater  detail  later. 

3.  Comparison  of  Transforms 

Since  the  2-D  DCT  is  applied  to  pixel  blocks  within  a  frame  that  are  subsequently 
quantized,  it  has  a  tendency  to  create  "blocking"  artifacts  that  disturb  the  continuity  of  the 
reconstructed  frame.  The  same  effect  also  leads  to  the  presence  of  "ringing"  artifacts 
around  sharp  edges.  In  contrast,  the  wavelet  transforms  typically  are  applied  to  the  entire 
image  and  separate  the  entire  image  into  regions  of  high  and  low  frequency  content. 
These  regions  may  be  quantized  and  coded  independently  and  result  in  more  efficient  bit 
allocation.  This  produces  a  more  visually  pleasing,  smoother  reconstructed  image 
compared  to  that  obtained  via  a  2-D  DCT  at  a  comparable  peak  signal-to-noise  ratio 
(pSNR).  In  general,  at  comparable  pSNRs,  wavelet  transform  coders  offer  compression 
gains  superior  to  that  of  DCT-based  coders  [12]. 

However,  several  drawbacks  have  limited  the  utility  of  wavelet-based  video 
compression.  Wavelets  achieve  quality  superior  to  DCT-methods  by  processing  the 
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entire  image  or  frame,  but  traditional  video  coders  exploit  temporal  correlation  at  a 
subframe  (usually  macroblock)  level.  Although  the  error  block  could  be  transformed  via 
a  DWT,  no  significant  advantage  has  been  determined  over  the  DCT,  and  the 
computational  effort  is  greater  [12].  However,  the  frequency  decomposition  offered  by 
DWTs  provides  a  powerful  basis  for  layered  video  coding. 

C.  QUANTIZATION  TECHNIQUES 

After  the  transform,  the  coefficients  are  quantized.  Quantization  is  a  lossy 
process  used  to  reduce  precision  and  zero  out  some  coefficient  values.  The  benefit  is  that 
each  coefficient  can  then  be  represented  with  fewer  bits.  However,  the  information  loss 
is  not  recoverable,  and  the  loss  is  manifested  as  distortion  within  the  reconstructed  image. 
This  distortion  is  termed  quantization  noise. 

The  typical  quantization  scheme,  uniform  quantization,  involves  dividing  each 
coefficient  Fuv  by  the  quantizer  step  size  Quv  and  rounding  the  result  to  the  nearest  integer 
to  produce  a  quantized  coefficient  Fquv  as  follows  [15]: 


F     =  nearest  integer 


F„ 


(H-6) 


The  values  used  in  image  reconstruction  are  then  Fquv  multiplied  by  the  step  size. 
However,  as  Equation  H-6  implies,  the  step  value  may  be  adjusted  for  each  particular 
coefficient  with  Quv  now  representing  elements  of  a  quantizer  matrix.  Choosing  the 
appropriate  step  size  involves  a  trade-off  between  acceptable  error  and  desired 
compression.  Employing  a  small  step  size  yields  low  quantization  noise  but  little 
compression.  The  opposite  is  the  case  for  a  large  step  size.  Although  uniform 
quantization  is  often  used,  the  choice  is  suboptimal  since  the  individual  coefficients  are 
not  distributed  uniformly  [16].  However,  two  broad  approaches  are  employed  to  refine 
the  selection  of  step  size:  HVS  weighting  and  bit  allocation  strategies.  HVS  weighting 
heuristically  refines  the  step  size  based  on  perceptual  relevance  while  bit  allocation 
strategies  attempt  to  spread  errors  optimally  across  all  coefficients. 
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1.  Human  Visual  System  (HVS)  Weighting 

Because  the  HVS  places  greater  relative  importance  on  lower  frequencies  than 
higher  frequencies,  step  sizes  based  on  HVS  modeling  are  selected  such  that  lower 
frequencies  coefficients  are  quantized  more  finely  while  higher  frequency  coefficients  are 
quantized  more  coarsely  [4].  The  HVS  is  also  more  acute  to  luminous  intensity  than 
chromatic  intensity.  Therefore,  different  quantizer  matrices  are  developed  for  each.  An 
example  of  a  luminance  quantization  matrix  widely  used  in  JPEG  compression  is  given  in 
Figure  II-6  [4].  The  dimensions  match  the  8x8  block  size  used  with  the  2-D  DCT.  In 
application  of  a  quantizer  matrix,  the  entire  matrix  can  be  scaled  by  a  multiplicative 
constant  in  order  to  scale  quantization  noise  and  compression  while  maintaining  the 
relative  importance  among  the  coefficients. 
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Figure  II-6:  HVS-Based  Luminance  Quantization  Matrix  [4]. 


Bit  Allocation 


Using  a  bit  allocation  approach,  the  value  of  the  step  size  is  chosen  to  minimize 
distortion  within  a  bit  budget.  It  is  a  classical  resource  allocation  problem,  where  the 
fundamental  trade-off  in  quantization  is  between  rate  (number  of  bits)  and  distortion 
(approximation  error)  formalized  as  rate-distortion  theory.  Several  sophisticated 
algorithms  utilize  Lagrange  methods  applied  to  arbitrary  rate-distortion  curves.  Of  these, 
a  popular  approach  is  varying  the  step  size  in  proportion  to  the  variance  of  the  coefficient 
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[17].  However,  bit  allocation  schemes,  regardless  of  the  methodology,  do  not  account  for 
the  sensitivity  to  different  spatial  frequency  characteristics  of  human  visual  perception. 

D.  ENTROPY  ENCODING 

Further  compression  can  be  realized  through  a  lossless  process  called  entropy 
encoding  that  removes  redundant  information.  The  simplest  style  of  entropy  encoding  is 
run-length  encoding  (RLE).  With  RLE,  a  data  set  is  parsed  to  locate  sequences  of 
repeated  values.  Any  such  sequence  is  replaced  by  a  codeword  consisting  of  a  delimiter, 
the  value,  and  the  number  of  times  the  value  is  repeated.  The  longer  the  sequence  of  the 
repeated  value,  the  greater  the  available  compression.  Following  quantization,  many 
elements  of  a  coefficient  block  typically  have  a  value  of  zero.  Often  many  high 
frequency  rows  and  columns  are  completely  composed  of  zeros.  Accordingly,  it  is 
advantageous  to  scan  the  coefficient  block  in  a  manner  as  to  produce  the  longest  runs  of 
zeros.  In  JPEG  compression,  scanning  the  quantized  coefficient  block  as  a  vector  in 
zigzag  fashion  starting  with  the  DC  coefficient  down  to  the  Fg,8  coefficient  has  been 
shown  to  increase  the  run-length  of  zeros  [4].  Since  the  repeated  value  here  is  known  (0), 
an  adaptation  of  the  basic  codeword  scheme  cited  above  consists  of  the  run-length  of 
zeros  followed  by  the  magnitude  of  the  next  non-zero  value.  If  there  is  no  remaining 
non-zero  value  in  the  block,  a  special  end-of-block  (EOB)  codeword  is  used  instead. 

After  RLE,  the  quantized  coefficient  block  is  represented  by  a  set  of  codewords, 
where  each  codeword  represents  a  symbol  drawn  from  a  larger  source  alphabet.  Variable- 
length  coding  (VLC)  minimizes  the  average  codeword  length  by  assigning  shorter 
codewords  to  the  most  probable  symbols  and  longer  codewords  to  the  least  likely 
occurring  symbols  while  maintaining  each  uniquely  decipherable  (UD).  Huffman  coding 
is  the  most  widely  used  entropy-encoding  algorithm  and  is  guaranteed  to  produce  a 
minimum  average  length,  UD  code  [15].  The  Huffman  algorithm  uses  each  symbol's 
probability  of  occurrence  and  builds  a  prefix  code  using  an  optimum  binary-branching 
tree.  Since  both  the  coder  and  decoder  need  to  use  the  same  coder  and  generating  a 
Huffman  table  is  computationally  expensive,  standard  tables  are  normally  pre-defined 
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using  data  drawn  from  test  images.  Optimal  compression  is  no  longer  guaranteed,  but 
encoding  and  decoding  are  faster,  and  the  need  to  transmit  the  VLC  table  is  avoided. 

E.  FRAME  CODING 

As  discussed  previously,  intraframe  coding  compresses  each  frame  of  a  video  as  a 
separate,  still  image.  Its  advantage  is  superior  error  resilience;  its  disadvantage  is  limited 
compression  gain.  Interframe  coding  exploits  frame-to-frame  correlation  by  coding  only 
the  difference  in  successive  frames,  thereby  allowing  the  potential  for  higher  compression 
gain  than  that  achieved  by  intraframe  coding  alone.  The  disadvantage  to  interframe 
coding  is  in  robustness  since  decode  errors  can  propagate  between  frames  and  spatially 
within  a  frame  depending  on  the  algorithm. 

Because  of  the  compression  required  for  the  target  VTC  scenario,  strict  intraframe 
coding  is  not  an  option.  Therefore,  a  robust  algorithm  for  interframe  coding  is  needed. 
Several  source-coding  techniques  used  to  exploit  temporal  redundancy,  such  as  motion 
compensation,  differential  pulse  code  modulation  (DPCM)  and  block  updating,  are 
discussed  in  [4].  For  a  robust  application  in  a  real-time,  heterogeneous,  multicast 
environment,  block  updating  was  judged  the  most  promising  and  is  discussed  further. 

Interframe  coding  via  block4  updating  is  a  variation  on  intraframe  coding.  With 
block  updating,  each  block  in  the  current  frame  is  compared  to  the  corresponding  block 
in  the  previous  frame,  and  a  distance  metric  is  calculated.  If  the  metric  is  above  a 
threshold  value,  that  block  is  intracoded  and  transmitted.  Otherwise,  no  further 
consideration  is  given  to  that  block;  it  is  skipped.  Thus,  block  updating  conserves 
bandwidth  by  coding  and  transmitting  only  those  blocks  that  have  changed  perceptually 
since  the  previous  frame  [18].    In  low-motion  video,  such  as  the  "talking  head"  scenario, 
motion  is  confined  to  a  relatively  small  region  within  a  frame  and  the  background 
remains  static.  A  significant  bandwidth  savings  is  possible. 


4  In  this  context,  "block"  implies  an  NxN  pixel  region,  not  necessarily  8x8. 
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In  order  for  block  updating  to  be  effective,  a  suitable  distance  metric  must  be 
defined.  Common  distance  metrics  employed  are  MSE,  sum  of  absolute  differences 
(SAD),  and  absolute  sum  of  differences  (ASD)  [4]  [14]. 

Utilizing  block  updating  alone  for  a  tactical  VTC,  where  end-users  may  join  a 
session  that  is  already  in  progress,  has  a  liability.  End-users  joining  late  will  never 
receive  a  block  that  does  not  exceed  the  selection  threshold,  thus  leaving  them  with  a 
partially  reconstructed  video.  In  order  to  mitigate  this  affect,  block  updating  can  be 
combined  with  an  aging  algorithm  that  periodically  forces  block  updates.  Such  an 
algorithm  guarantees  a  full  scene  reception  within  some  set  interval  [14]. 

F.  MEASURING  QUALITY 

Given  that  video  coders  trade  compression  gain  for  image  quality,  quantifying  the 
level  of  distortion  introduced  is  useful  in  evaluating  different  coding  schemes.  A  useful 
measure  of  image  distortion  D  is  the  MSE  between  the  original  ( x )  and  reconstructed 
(x)  images  [4]: 

o=T^IIk.-*,J2  <n-7) 

mly    n=l  m=l 

Using  distortion  D,  the  signal-to-noise  ratio  (SNR)  is  determined  as 

sAra=ioiog10— ,  m-8) 

where  <72  is  the  variance  of  the  original  image.  The  most  widely  published  measure  of 
image  quality  is  the  peak  signal-to-noise  ratio  given  by 

K1 
pSNR  =  \0\oglQ—,  (H-9) 

where  K  is  the  maximum  peak-to-peak  value  in  the  image,  255  for  the  typical  8-bit 
image.  For  example,  a  typical  peak  SNR  for  a  typical  JPEG  encoded  grayscale  image  is 
28  dB  at  0.5  bpp. 

Using  MSE  as  a  measure  of  image  quality  does  have  drawbacks.  MSE  does  not 
directly  define  perceptual  quality  since  all  errors  are  given  equal  weight.  Two 
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compression  techniques  yielding  the  same  MSE  for  an  image  may  deliver  slight 
differences  in  perceptual  quality. 
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III.       LAYERED  VIDEO  CODECS 

A.  LAYERED  CODING  SCHEMES 

Parker  [9]  discusses  several  previous  approaches  to  layered  video  coding.  They 
can  be  classified  into  three  fundamental  categories:  component-based  layering,  spatial- 
based  layering,  and  frequency-based  layering. 

Component-based  layered  coders  transmit  a  base  layer  and  usually  a  single 
enhancement  layer.  A  traditional  DCT-based  coder  (yielding  some  set  quality)  supplies 
the  base  layer.  The  enhancement  layer  improves  this  quality  by  either  providing 
augmenting  information  to  the  base  layer  or  correcting  distortion  present  within  the  base 
layer.  Two  such  schemes  are  proposed  by  Rhee  and  Gibson  [8]  [20].  Implicit  in  the 
component-based  approach  is  that  the  enhancement  layer  may  duplicate  information 
already  present  in  the  base  layer. 

Spatial-based  layered  coders  partition  a  frame  into  hierarchical  areas  of  interest 
and  encode  each  area  separately.  For  example,  the  "talking  head"  frame  shown  in  Figure 
HI-1  may  be  partitioned  into  two  areas  of  interest  with  the  speaker  being  of  primary 
interest  and  the  remainder  of  the  frame  being  of  secondary  interest.  With  spatial-based 
layering,  relatively  more  bandwidth  would  be  allocated  to  the  region  encompassing  the 
speaker  with  the  balance  of  available  bandwidth  being  apportioned  to  the  rest  of  the 
frame.  Bahl  and  Hsu  [21]  have  proposed  a  coder  that  incorporates  content-sensitive 
spatial  decomposition  and  multiresolution  coding.  The  difficulty  with  implementing  a 
spatial-based  coder  is  creating  the  hierarchical  areas  of  interest  dynamically  while 
minimizing  the  overhead  required  to  identify  their  shifts  relative  to  the  layer  assignments. 
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Figure  HI-1:  Sample  "Talking  Head"  Video  Frame. 

Frequency-based  layered  coders  decompose  a  frame  into  subbands  and  then 
arrange  the  subbands  into  individual  layers.  The  frequency  decomposition  may  be 
applied  to  the  entire  frame  or  individual  macroblocks  within  the  frame.  Each  layer  may 
contain  one  or  more  subbands.  The  base  layer  contains  at  least  the  low  frequency 
subband  but  may  also  contain  higher  frequency  subbands  in  order  to  improve  base  layer 
quality.  Figure  1H-2  illustrates  this  concept  using  the  DWT  to  decompose  a  frame,  where 
the  LL  subband  alone  constitutes  the  base  layer,  the  LH  and  HL  subbands  are  combined 
into  a  first  enhancement  layer,  and  the  HH  subband  constitutes  the  second  enhancement 
layer. 
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Figure  III-2:  Basic  Layered  Video  Coder  Using  the  DWT. 

The  principle  advantages  to  frequency-based  layering  are  its  extendibility  and 
flexibility.  Further  decomposing  the  frame  through  application  of  the  DWT  to  each 
subband  depicted  in  Figure  HI-2  creates  a  total  of  16  subbands.  Repeated  application  of 
the  DWT  to  these  subbands  creates  even  more  subbands.  With  a  larger  number  of 
subbands,  there  is  increased  flexibility  in  the  manner  in  which  they  can  be  assigned  to 
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layers.  This,  in  turn,  facilitates  exploiting  each  subband's  perceptual  importance. 
Additionally,  a  greater  number  of  subbands  permits  a  greater  possible  number  of  layers. 
Three  coders  that  utilize  the  frequency-based  approach  are  discussed  in  [14],  [22],  and 
[23]. 

B.  A  LOW-COMPLEXITY,  ADAPTIVE  CODER  FOR  TACTICAL  VTC 

As  cited  above,  several  diverse  approaches  to  designing  layered  coders  have  been 
proposed;  each  emphasizes  different  network  architectures  or  applications  to  varying 
degrees.  However,  there  is  neither  a  consensus  in  identifying  a  preferred,  structured 
approach  nor  a  consensus  in  quantifying  those  parameters  that  make  a  layered  coder 
effective.  This  section  presents  the  development  and  implementation  of  a  new  layered 
coder  that  is  specifically  tailored  for  the  tactical  VTC  scenario  where  limited  transmission 
bandwidth  is  available,  and  the  robustness  of  transmission  is  at  a  premium. 

Both  the  characteristics  of  the  tactical  VTC  application  and  the  desire  to  quantify 
effectiveness  guide  the  strategy  for  developing  and  implementing  the  new  layered  coder. 
Specifically,  the  application  yields  five  requirements.  The  coder  must  1)  provide  a  video 
stream  characterized  by  the  bandwidth,  resolution,  frame  rate,  and  color  depth  detailed  in 
Table  1-1,2)  optimize  compression  adaptively  for  both  low  motion  video  and  static 
slides,  3)  provide  a  low  complexity  architecture  to  minimize  coding  delays  and  power 
requirements,  4)  provide  error  resilient  decoding  at  high  packet  loss  rates,  and  5) 
constrain  bit  rate  to  a  predetermined  average. 

The  guiding  factors  stemming  from  the  desire  to  quantify  effectiveness  are 
twofold.  First,  the  coder  must  provide  a  base  layer  with  acceptable  quality  and  two  (or 
more)  enhancement  layers  that  progressively  improve  perceptual  quality.  Second,  the 
coder  must  minimize  the  bitstream  overhead  required  to  accommodate  the  layering 
structure.  A  functional  diagram  of  the  implemented  coder  is  shown  in  Figure  ni-3.  Each 
component  is  addressed  in  follow  on  sections. 
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Figure  III-3:  Functional  Block  Diagram  of  Proposed  Layer  Coder. 

1.         Temporal  Compression  via  Block  Updating 

Temporal  compression  is  accomplished  via  block  updating  applied  at  the 
macroblock  level.  Only  those  macroblocks  exhibiting  sufficient  change  on  a  frame-to- 
frame  basis  are  encoded.  Although  block  updating  has  been  shown  to  yield  inferior 
compression  performance  relative  to  motion  prediction  algorithms,  block  updating  was 
chosen  because  the  performance  differential  is  small  when  low  activity  video  is 
considered  [24],  and  greater  robustness  is  realized  because  temporal  error  propagation  is 
greatly  limited  and  spatial  error  propagation  is  eliminated.  Block  updating  also  negates 
the  need  for  the  locally  decoded  reference  frame  that  is  common  to  motion  prediction 
schemes.  This  greatly  simplifies  the  coder  by  eliminating  the  overhead  required  by  the 
performance  of  an  inverse  quantization/transform  at  the  coder.  Block  selection,  as 
considered  here,  is  solely  with  regard  to  motion  video  sequences.  Since  static  slide 
sequences  exhibit  little  or  no  motion,  block  selection  via  motion  detection  is  of  limited 
utility  there.  Indeed,  transmissions  during  static  sequences  result  from  considerations 
presented  in  the  next  section,  where  macroblock  aging  is  discussed. 

Sufficient  change  between  corresponding,  frame-successive  macroblocks  is 
determined  by  calculating  a  distance  metric  between  them  and  comparing  the  result  to  a 
threshold.  The  distance  metric  utilized  is  the  non-normalized  ASD  given  by: 
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where  xn  m  represents  the  pixel  value  within  the  current  block  while  x*  n  represents  the 

pixel  value  in  the  reference  block.  The  primary  reasons  for  choosing  ASD  over  other 
metrics  are  twofold.  First,  the  ASD  is  computationally  efficient  compared  to  MSE  and 
SAD.  The  ASD  employs  only  additions  and  subtractions  and  a  single  absolute  value 
operation.  MSE  requires  expensive  multiplication  operations  -  making  it  ill  suited  to 
real-time  applications.  SAD  requires  the  same  number  of  additions  and  subtractions  as 
ASD,  but  it  requires  MxN-  1  more  absolute  value  operations.  Second,  since  the  ASD 
takes  a  single  absolute  value  of  a  sum,  it  acts  as  an  accumulator  and  provides  a  lowpass 
filtering  effect  that  removes  noise  in  pixel  intensities  introduced  through  video  capture. 
This  smoothing  prevents  the  threshold  from  being  exceeded  spuriously  in  otherwise  static 
regions  of  the  frame  sequence  whereas  the  nonlinear  operations  performed  on  a  per-pixel 
basis  in  MSE  and  SAD  tend  to  accumulate  this  noise  energy  and  yield  spurious  block 
selections.  Thus,  the  ASD  metric  allows  bandwidth  to  be  more  effectively  devoted  to 
regions  of  interest  [14]. 

Two  independent  elements  affect  video  quality  and,  therefore,  the  required  bit 
rate:  adequate  motion  detection  (to  prevent  "jerky"  reconstructed  video)  and  control  of 
distortion  introduced  through  quantization.  The  goal  in  motion  detection  is  to  select  the 
maximum  block  selection  threshold  that  adequately  captures  motion.  In  the  video 
sequences  examined,  a  threshold  of  160  (for  ASD)  proved  adequate  for  detecting 
perceptual  motion  and  resulted  in  an  average  of  24.8  macroblocks  selected  per  frame. 

Further  considering  the  issue  of  computational  expense,  the  ASD  metric  is  applied 
to  individual  8x8  blocks  within  the  macroblock;  the  first  block  to  exceed  the  threshold 
triggers  macroblock  selection  and  ends  the  search.  This  avoids  the  expense  of  examining 
the  remaining  blocks  of  the  macroblock.  Additionally,  since  the  HVS  acuity  is  more 
sensitive  to  changes  in  luminous  than  chromatic  intensity  [4],  distance  calculations  are 
confined  to  the  luminous  component  of  each  pixel  even  if  chromatic  information  were 
available. 
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Since  motion  in  VTC  scenes  tends  to  be  confined  to  discrete  objects  within  the 
scene  (as  opposed  to  scene  motion  caused  by  a  camera  pan),  search  efficiency  is  slightly 
affected  by  the  order  in  which  the  individual  blocks  are  examined.  The  approach  that 
proved  more  efficient  in  the  test  video  sequences  considered  here  is  to  maximize  the 
distance  between  the  first  two  blocks  examined.  As  shown  in  Figure  III-4,  two  search 
patterns  were  compared:  a  clockwise  search  starting  from  the  upper  left  block  and  an  X- 
pattern  search  that  examines  the  upper  left  block  followed  by  the  lower  right.  Given  that 
a  macroblock  was  selected  for  transmission,  the  X-pattern  resulted  in  a  2.5%  decrease  in 
the  average  number  of  blocks  examined  per  frame. 
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Figure  III-4  Block  Search  Order;  a.)  Clockwise  and  b.)  X-pattern. 

Still  greater  search  efficiency  is  realized  by  using  the  X-pattern  search  and 
varying  the  starting  block  of  each  macroblock  to  match  the  anticipated  motion  at  that 
point  in  the  frame.  Because  motion  in  VTC  sequences  is  fairly  confined,  macroblocks 
tend  to  be  selected  in  the  same  manner  frame-to-frame.  For  example,  a  speaker  shifts  left 
or  right  and/or  slightly  up  or  down.  Therefore,  search  speed  is  increased  by  having  the 
search  algorithm  remember  the  identity  of  the  specific  block,  termed  the  "anchor",  which 
caused  a  particular  macroblock  to  be  selected  in  the  previous  frame.  For  the  subsequent 
distance  metric  calculations,  search  begins  at  the  anchor.  If  the  anchor  again  causes 
selection  or  if  the  macroblock  is  not  selected,  the  anchor  identity  is  unchanged.  If 
another  block  causes  selection,  the  anchor  identity  is  updated.  Using  this  search  scheme 
produced  an  additional  20%  reduction  in  the  number  of  blocks  examined  for  selected 
macroblocks.  A  more  complex  approach  not  examined  here  is  to  remember  the  two 
blocks  that  most  frequently  cause  selection  and  tailor  the  search  accordingly. 
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2.  Aging  Algorithm 

Utilizing  this  block  updating  scheme  alone  for  macroblock  refreshment  exposes 
some  undesirable  performance  characteristics.  One  such  consequence  is  termed 
hysteresis.  Consider  an  arbitrary  macroblock  whose  content  is  changing  temporally  due 
to  the  movement  of  an  item  within  its  spatial  bounds.  The  item  travels  from  its  initial 
position  along  some  trajectory  to  its  final  position.  At  some  point  in  the  trajectory,  let  the 
change  in  the  macroblock' s  content  be  sufficient  to  exceed  the  motion-detection 
threshold,  so  the  macroblock  is  selected  for  update.  If  the  final  position  of  the  item  has 
not  yet  been  reached  at  the  time  of  macroblock  selection,  the  item  continues  movement  to 
its  destiny.  Once  the  final  position  is  reached,  hysteresis  occurs  if  the  distance  between 
this  final  position  and  the  item's  position  at  the  last  macroblock  update  is  insufficient  to  * 
force  another  macroblock  selection;  that  is,  the  distance  is  less  than  the  threshold.  In  this 
case,  the  displayed  macroblock  at  the  receiver  is  left  with  a  persistent  error.  As  another 
illustration  of  hysteresis,  consider  a  frame  sequence  depicting  only  slight  motion 
contained  within  an  arbitrary  macroblock' s  spatial  position.  If  the  change  between 
successive  macroblocks  as  calculated  by  the  distance  metric  is  below  the  threshold  value, 
the  macroblock  is  not  selected  for  encoding.  If  several  frames  in  the  video  sequence 
continue  to  depict  similar  slight  motion  below  the  threshold,  the  displayed  macroblock  at 
the  receiver  eventually  shows  a  persistent  error. 

Another  problem  is  in  the  duration  the  error  artifacts  (introduced  by  the  channel 
through  dropped  or  corrupted  packets)  are  maintained  in  a  receiver's  reconstructed  video. 
Error  artifacts  in  the  dynamic  portion  of  a  scene  tend  to  last  only  a  single  frame  because 
block  updates  occur  frequently.  However,  any  such  error  in  a  relatively  static  region 
tends  to  persist  longer  due  to  the  much  lower  frequency  of  block  updating  there. 

A  final  problem  occurs  when  new  participants  are  allowed  to  join  a  VTC  already 
in  progress  (dynamic  multicast).  Since  only  those  macroblocks  depicting  motion  above 
the  perceptual  threshold  are  transmitted,  new  participants  receive  only  a  portion  of  the 
current  scene.  Although  the  portion  received  is  the  most  dynamic  region  of  frame  (e.g. 
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the  speaker),  a  speaker  completely  disassociated  from  the  background  yields  an  awkward 
reconstruction. 

Complementing  the  block  updating  scheme  with  an  aging  algorithm  that 
intermittently  forces  macroblock  updates  alleviates  these  problems.  The  general 
principle  is  that  the  coder  monitors  the  number  of  frames  since  each  macroblock  was  last 
encoded  -  its  "age."  If  a  macroblock' s  age  exceeds  a  predetermined  interval,  that 
macroblock  is  flagged  for  encoding  and  transmission.  Thus,  the  aging  algorithm 
guarantees  a  maximum  period  between  macroblock  updates  and  bounds  both  the  duration 
of  hysteresis  errors  and  the  persistence  of  visual  artifacts  caused  by  the  channel.  Such 
bounding  also  ensures  that  new  participants  to  a  dynamic  multicast  session  can  construct 
an  entire  frame  in  a  timely  fashion. 

Obviously,  forcing  selection  of  macroblocks  for  transmission  that  would  not  have 
been  selected  otherwise  increases  the  bandwidth  requirement,  but  this  impact  can  be 
mitigated  by  the  manner  in  which  the  aging  algorithm  is  implemented.  Important 
considerations  are  the  following.  Simply  increasing  the  interval  for  macroblock  selection 
by  aging  decreases  the  required  bandwidth  but  increases  the  persistence  of  decode  errors 
at  receivers  and  prolongs  the  time  required  for  new  participants  to  receive  a  full  frame. 
Merely  forcing  a  macroblock  update  after  n  frames  have  passed  without  selection  leads  to 
an  undesirable  correlation  in  updates  following  scene  changes.  Although  motion  within  a 
scene  tends  to  disperse  updates  to  some  extent,  a  sufficiently  static  background  region 
would  still  lead  to  correlation  of  a  significant  fraction  of  the  block  updates.  The  worst 
case  is  realized  by  a  scene  change  where  the  new  scene  is  entirely  static,  such  as  an 
overhead  slide.  In  this  case,  the  bit  rate  would  spike  every  n  frames.  In  order  to  avoid 
undesirable  spikes  in  bit  rate,  it  is  advantageous  to  spread  the  number  of  macroblocks 
selected  by  aging  evenly  over  time.  This,  in  turn,  requires  a  scheme  that  ages  each  block 
independently. 

The  aging  algorithm  developed  for  the  coder  implemented  here  tracks  the  age  of 
each  macroblock  indirectly.  Instead  of  counting  the  frames  since  a  given  macroblock 
was  last  updated,  the  coder  maintains  an  update  vector  identifying  the  number  of  frames 
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remaining  until  each  macroblock  must  be  updated.  Each  value,  m,  in  the  update  vector  is 
chosen  from  a  discrete  uniform  distribution  in  the  range  [  1  ,n]  and  is  interpreted  as  an 
update  scheduled  m  frames  in  the  future.  Using  a  uniform  distribution  to  schedule 
updates  smoothes  macroblock  selection  over  n  frames  and  decorrelates  the  selections  due 
to  aging.  Choosing  the  aging  intervals  randomly  also  prevents  events,  such  as  scene 
changes,  from  correlating  updates  and  causing  periodic  spikes  in  bit  rate.  The  specific 
value  chosen  for  n  controls  the  tradeoff  between  the  additional  bandwidth  required  and 
coder  responsiveness.  For  a  given  value  of  n,  the  average  number  of  macroblocks 
selected  through  aging  per  frame,  Mn< ,  is 

7T         2M  /TTT  „x 

Mw   =-, r,  (HI-2) 

(n  +  1) 
where  M  is  the  total  number  of  macroblocks  in  a  frame. 

The  final  block  selection  algorithm  is  performed  by  the  function,  m_blk_id_xr.m. 
m_blk_id_xr.m  is  provided  in  Appendix  C,  and  it  incorporates  motion  detection  and  aging 
as  follows.  As  each  frame  is  captured,  the  update  vector  entry  corresponding  to  each 
macroblock  is  decremented  by  one.  As  each  macroblock  is  processed  for  selection  in  a 
given  frame,  the  coder  examines  the  macroblock' s  entry  in  the  update  vector.  If  its 
corresponding  entry  has  reached  zero,  the  macroblock  is  selected  for  transmission. 
Otherwise,  the  distance  metric  is  applied  to  determine  if  the  macroblock  should  be 
selected  due  to  motion.  If  either  criterion  is  met,  a  new  random  entry  is  generated  for  that 
update  vector  position.  The  order  of  these  two  events  is  important.  Since  the  distance 
metric  need  not  be  calculated  if  the  macroblock  is  selected  by  aging,  there  is  net  decrease 
in  the  number  of  calculations  required  to  select  a  macroblock  for  transmission. 

For  the  assumed  VTC  video  format  of  176x144  QCIF,  M  is  99.  At  the  assumed 
frame  rate  of  10  fps,  setting  n  to  20  guarantees  all  macroblocks  are  encoded  within  two 
seconds  and  yields  9.43  as  the  average  number  of  macroblocks  selected  through  aging  per 
frame.  However,  the  true  bandwidth  impact  is  less  than  this  value  since  some  of  the 
macroblocks  selected  via  aging  would  have  been  selected  by  motion.  For  the  test  motion 
video  sequences  considered  here,  the  average  number  of  macroblocks  selected  per  frame 
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increased  to  31.4  (from  24.8  without  aging).  The  payoff  for  this  modest  increase  in 
bandwidth  is  that  all  the  problems  encountered  by  using  the  distance  metric  alone  for 
macroblock  selection  are  bounded  to  this  duration.  The  duration  of  two  seconds 
represents  a  compromise  judged  acceptable  between  the  additional  bandwidth  required 
and  the  desired  coder  responsiveness. 

3.         Layering  Strategy 

Macroblocks  selected  for  transmission  are  decomposed  in  frequency  using  a 
DWT.  Performing  the  selection  process  prior  to  the  transform  reduces  computational 
cost  because  the  transform  is  only  applied  to  those  macroblocks  requiring  transmission. 
The  DWT  was  chosen  since  frequency  decomposition,  as  discussed  earlier,  offers  the 
most  flexibility  in  the  subsequent  populating  of  layers.  However,  the  task  of  determining 
a  consistent,  extendable  scheme  for  determining  an  appropriate  number  of  layers  and  the 
manner  in  which  the  frequency  content  within  each  macroblock  was  to  be  apportioned 
across  those  layers  remains. 

Parker  [9]  proposes  a  set  of  heuristic  guidelines  to  determine  the  appropriate  layer 
assignments  and  bit  allocation.  First,  as  layers  are  to  be  hierarchical  in  importance,  layer 
assignments  should  map  frequency  content  to  that  hierarchy  in  a  manner  consistent  with 
perceptual  importance.  Second,  the  base  layer  must  provide  acceptable  quality,  and  the 
addition  of  each  enhancement  layer  must  provide  a  perceptual  improvement  in  quality. 
Since  the  broad  goal  in  image  or  video  coding  is  to  remove  information  that  is  not 
perceptually  relevant,  transmitting  a  layer  that  provides  no  perceptual  improvement  in 
quality  wastes  bandwidth  and  is,  therefore,  undesirable.  Third,  the  number  of  bits 
assigned  to  each  layer  should  be  substantive  so  that  dropping  a  layer  potentially  decreases 
congestion  in  the  network.  Finally,  the  bandwidth  consumed  by  the  image  data  should 
also  be  sufficient  to  avoid  an  excessive  relative  consumption  by  network  control  symbols 
and  overhead.  This  is  especially  critical  for  low  bit  rate  video. 

The  DWT  chosen  was  the  fast  Haar  transform  (FHT).  The  FHT  has  several 
desirable  properties  with  regard  to  minimizing  coder  complexity.  First,  the  FHT,  as  a 
real  transform,  avoids  the  necessity  for  complex  arithmetic  and  simplifies  storage. 
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Second,  the  FHT  is  not  computationally  demanding,  requiring  only  addition,  substraction, 
and  left-  and  right-shifts  [13].  Finally,  unlike  more  sophisticated  wavelet  transforms,  the 
FHT  does  not  require  extending  or  padding  the  data  set.  However,  the  simplicity  of  the 
FHT  may  lead  to  blocking  artifacts  at  high  compression  levels  since  the  average  and 
detail  calculations  are  confined  only  to  contiguous  pixels. 

Since  the  defining  equations  of  the  FHT  and  their  manner  of  application  to  affect 
a  first  order  frequency  decomposition  were  discussed  previously  in  Section  II.B.2,  only  a 
summary  of  the  first  order  FHT  decomposition  is  detailed  in  Table  III-l.  The  operation 
is  accomplished  via  the  function  fht.m  as  provided  in  Appendix  C.  Again,  higher  order 
frequency  decompositions  may  be  accomplished  by  recursively  applying  the  FHT  to  each 
of  the  subbands  created  by  the  next  lower  order  decomposition. 
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Table  III-l:  Significance  and  Determination  of  DWT  Subbands. 

Although  a  greater  number  of  layers  would  offer  more  flexibility  in  managing 
quality  and  congestion,  the  coder  restricts  the  number  of  layers  to  three.  The  decision  to 
limit  the  number  of  layers  to  three  was  driven  primarily  by  the  constraint  of  a  target  bit 
rate  in  range  of  64-96  kbps.  At  these  bit  rates  and  with  each  layer  consuming  an  equal 
amount  of  bandwidth  overhead,  three  layers  appeared  to  be  the  limit  in  terms  of 
producing  enhancement  layers  that  provided  a  perceptual  improvement  in  quality  while 
maintaining  a  base  layer  that  rendered  an  acceptable  quality. 

Establishing  a  suitable  layered  structure  for  motion  video  sequences  can  be  posed 
as  the  following  conditional,  two-part  problem.  Given  that  n  layers  are  desired, 
determine  the  degree  to  which  a  selected  macroblock  is  decomposed  in  frequency  and  the 
manner  in  which  the  resulting  subbands  are  assigned  into  layers.  Parker  [9]  proposes 
using  a  variant  of  a  split-and-merge  algorithm,  which  was  originally  proposed  by  Diab,  et 
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al.  [25]  to  identify  regions  of  equivalent  activity  in  the  spatial  domain,  and  applying  it  at 
the  macroblock  level  in  the  frequency  domain  to  identify  region  of  similar  energy  and 
perceptual  content.  The  particulars  of  the  implementation  follow,  but  the  essence  of  the 
approach  is  that  a  selected  macroblock  is  uniformly  decomposed  in  frequency  via  the 
FHT,  subbands  of  approximately  equal  variance  are  merged,  and  the  resulting  regions  are 
apportioned  into  individual  layers.  These  steps  were  performed  on  representative  video 
sequences,  and  the  ensuing  layering  structure  is  implemented  within  the  coder. 

Utilizing  representative  video  sequences,  each  selected  macroblock  is  uniformly 
decomposed  in  frequency  by  recursively  applying  the  FHT  until  the  desired  number  of 
subbands  is  obtained.  The  first  order  analysis  creates  four  8x8  subbands;  the  second 
order  analysis  produces  sixteen  4x4  subbands.  The  third  order  analysis  results  in  sixty- 
four  2x2  subbands.  Extending  the  recursion  to  its  limit  yields  128  subbands  consisting  of 
a  single  point.  In  practical  terms,  a  second  order  analysis  as  shown  in  Figure  EI-5  proved 
sufficient  for  three  layers. 


16x16 
Macroblock 


FHT 

► 


8x8 
Subband 

FHT 


4x4 

Figure  III-5:  Second  Order  FHT  Decomposition  of  a  Macroblock. 

Next,  the  variance  of  the  coefficient  set  composing  each  subband  was  determined 
across  all  frames  of  video.  These  variances  were  then  used  as  a  subband  merging  metric. 
Two  benefits  are  afforded  by  employing  subband  variance  as  the  merging  metric.  First, 
with  motion  video,  the  variance  of  coefficient  sets  appear  to  possess  an  inverse 
relationship  to  spatial  frequency  and,  by  extension,  to  perceptual  importance.  That  is,  the 
more  perceptually  relevant  subbands  at  lower  frequencies  exhibit  higher  variances. 
Consequently,  differences  in  variance  provide  a  convenient  mechanism  for  assigning 
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subbands  to  a  layered  hierarchy.  Second,  grouping  subbands  with  similar  variances 
simplifies  coder  architecture  since  each  group  can  employ  a  common  quantizer.  Several 
quantization  algorithms  use  variance  as  an  indication  of  the  dynamic  range  exhibited  by 
coefficients  and  allocate  bits  by  varying  quantizer  step  size  in  inverse  proportion  to 
variance.  The  particulars  of  the  quantization  scheme  employed  are  discussed  in  more 
detail  later. 

The  subband  variances  calculated  from  a  first  order  analysis  of  the  test  video 
sequences  are  shown  in  Figure  HI-6.  Those  from  a  second  order  analysis  are  shown  in 
Figure  rH-7.  As  Figure  H[-6  illustrates,  subband  variance  provides  a  good  indication  of 
the  energy  concentration  within  each  subband.  For  instance,  since  motion  video  is 
characteristically  lowpass,  the  variance  is  largest  in  the  LL  subband  and  smallest  in  the 
HH  subband.  By  extension,  subband  variance  therefore  serves  as  a  relative  indicator  of 
the  perceptual  importance  among  subbands  -  an  observation  that  suggests  subband 
variance  should  dictate  layer  assignments.  As  Figure  III-7  illustrates,  a  second  order 
analysis  further  separates  the  frequency  content  of  the  subband  to  which  it  is  applied. 
The  LL  subband  decomposes  into  the  LLLL,  LLLH,  LLHL,  and  LLHH  subbands 
containing  the  lowpass,  horizontal,  vertical,  and  diagonal  edge  details,  respectively, 
which  were  previously  lumped  into  the  LL  subband  alone.  Performing  the  second  order 
analysis  of  the  LH  (horizontal  detail)  subband  apportions  energy  in  a  manner  symmetrical 
to  the  original  first  order  analysis;  that  is,  as  the  LH  subband  is  found  in  the  northeast 
quadrant  of  Figure  IQ-6,  the  energy  distribution  attained  by  the  second  order  analysis  is 
concentrated  in  the  northeast  and  northwest  sub-quadrants  with  a  slightly  greater  energy 
in  the  northeast  sub-quadrant  as  shown  in  Figure  III-7.  Similar  observations  can  be  made 
for  the  HL  and  HH  subbands.  These  characteristics  further  leverage  the  argument  for 
using  subband  variances  to  make  layer  assignments  in  a  hierarchical  manner. 
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Figure  III-6:  Subband  Variances  after  a  First  Order  Analysis  (Motion  Video). 
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Figure  III-7:  Subband  Variances  after  a  Second  Order  Analysis  (Motion  Video). 

After  variance  data  had  been  gathered  for  each  subband  at  the  desired  order  of 
analysis,  the  next  step  was  to  merge  adjacent  subbands  exhibiting  similar  variances  into 
an  entity  termed  a  "partition."  The  criterion  outlined  by  [9]  is  to  merge  adjacent 
subbands  k\  and  kj  when 
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where  Nb  is  the  total  number  of  subbands  created  by  successive  application  of  the  FHT, 
and  a2min  and  a2max  are  the  minimum  and  maximum  variances,  respectively,  found  among 
all  the  subbands.  Applying  the  merge  algorithm  to  the  subbands  listed  in  Figure  III-7 
results  in  the  partitions  shown  in  Figure  III-8.  Assuming  that  subbands  are  statistically 
independent,  the  variance  of  each  partition  P*  is  now  simply  the  sum  of  the  variances  for 
the  subbands  £,  comprising  that  partition: 
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Figure  III-8:  Partitions  Resulting  from  Merge  Algorithm  (Motion  Video). 

Next,  these  partitions  are  assigned  to  layers  L,  until  the  requisite  number  of  layers 
were  created  using  the  set  of  heuristic  rules  proposed  in  [9]. 


Rule  1:  No  layer  may  have  a  greater  variance  than  a  lower  layer.  That  is,  given  N 
layers, 
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Rule  2:  Layers  must  be  populated  in  order  of  increasing  frequency.  A  layer  may 
not  contain  a  partition  of  lower  frequency  content  than  any  layer  below  it. 

Rule  3:  Partitions  that  meet  the  criterion  given  by  Equation  HI-3  are  assigned  to 
the  same  layer  even  if  the  partitions  are  non-contiguous. 

Rule  4:  Partitions  are  applied  to  layers  in  a  symmetric  fashion. 

Rule  5:  If  more  than  two  subbands  comprising  a  coarser  subband  remain  as 
partitions  after  merging  and  applying  the  above  rules,  all  of  the  partitions 
comprising  the  coarser  subband  are  merged  together  into  one  partition. 
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Rule  6:  If  one  or  more  partitions  is  moved  between  layers,  as  required  to  achieve 
a  more  balanced  distribution  of  bit  rates  or  quality,  move  the  partition(s)  with  the 
lowest  variance  if  promoting  to  a  higher  layer  and  the  partition(s)  with  highest 
variance  if  demoting  to  a  lower  layer. 

Application  of  these  rules  to  the  partitions  shown  in  Figure  ni-8  culminated  in  the 
final  layering  scheme  for  motion  video  sequences  shown  in  Figure  M-9.  The  base  layer 
is  essentially  a  lowpass-filtered  version  of  the  original  macroblock,  and  the  two 
enhancement  layers  progressively  added  higher  frequency  details.  Additionally,  since  the 
LL  subband  retains  many  of  the  perceptual  properties  of  the  original  macroblock,  it  was 
further  transformed  using  the  2-D  DCT  via  the  function,  dct_of_fht.m.  This  additional 
transform  allowed  the  LL  subband  to  be  processed  using  JPEG-based  quantization  and 
encoding  techniques  to  maximize  retention  of  the  most  perceptually  relevant  information. 

Summarizing  the  process  as  implemented  in  the  coder,  each  macroblock  selected 
for  transmission  is  decomposed  in  frequency  using  the  FHT.  The  LL  subband  is  further 
transformed  via  the  2-D  DCT  as  previously  discussed  and  assigned  to  layer  I.  The  HH 
subband  is  assigned  to  layer  HI  in  its  entirety.  The  HL  and  LH  subbands  are  decomposed 
with  a  second  application  of  the  FHT.  The  resulting  subbands  are  then  partitioned  and 
assigned  to  layers  II  and  IE  as  appropriate. 
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Figure  III-9  Final  Layering  Scheme  for  Motion  Video  Sequences. 

The  situation  with  static  slide  sequences  consisting  of  text  and  line  drawings  is 
much  different.  Sequences  such  as  these  demonstrate  a  greater  dependency  on  their 
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higher  frequency  components  for  perceptual  recognition.  A  hierarchical  layering  scheme 
based  on  the  lowpass  characteristics  of  motion  video  yields  blurred,  indistinct 
reconstructions  until  the  higher  frequency  components  are  included  as  well.  Such  a 
structure,  of  course,  is  contrary  to  the  principle  of  layered  video  transmission.  Therefore, 
a  separate  layering  scheme  is  needed  if  the  video  stream  is  to  include  both  types  of 
sequences. 

The  foregoing  remarks  not  withstanding,  pursuing  the  split-and-merge  algorithm 
still  provides  valuable  insight.  The  variances  of  the  subbands  produced  by  a  first  order 
FHT  analysis  of  the  text  and  line  drawing  static  sequences  are  shown  in  Figure  HI- 10; 
those  of  a  second  order  FHT  analysis  are  shown  in  Figure  HI-1 1 .  Compared  to  Figure 
ni-6  and  Figure  IQ-7  -  the  analogous  results  for  motion  video  sequences  -  it  is  clear  that 
energy  is  much  more  evenly  distributed  among  subbands  of  static  slides.  In  practical 
terms,  this  implies  a  much  more  complex  relationship  between  variance  and  perceptual 
importance. 
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Figure  111-10:  Subband  Variances  after  a  First  Order  Analysis  (Static  Slides). 
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Figure  III-ll:  Subband  Variances  after  a  Second  Order  Analysis  (Static  Slides). 

Applying  the  split-and-merge  as  before  results  in  the  partitions  identified  in 
Figure  HI- 12.  Using  the  rules  of  layer  assignment  as  before,  a  reasonable  layering 
scheme  might  assign  partitions  Pi,  P2,  and  Pa,  to  layer  I.  However,  this  implementation 
led  to  a  poor  reconstruction.  Even  adding  P3,  P5,  and  P$  so  that  the  vast  majority  of  the 
energy  was  included  in  the  reconstructed  image  failed  to  provide  acceptable  quality. 
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Clearly  contrary  to  the  motion  video  case,  variance  alone  is  a  poor  guide  to  determining 
perceptual  relevance. 
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Figure  111-12:  Partitions  Resulting  from  Merge  Algorithm  (Static  Slides). 

Producing  an  "effective"  base  layer  required  allocating  a  portion  of  the  frequency 
content  from  each  of  the  8x8  subbands  to  it.  Therefore,  while  maintaining  symmetry  in 
assignment,  those  4x4  subbands  that  exhibited  comparatively  larger  variances  among  the 
four  subbands  derived  from  the  same  parent  subband  were  allocated  to  layer  I.  The 
remaining  subbands  were  divided  between  the  remaining  layers  in  order  of  increasing 
frequency.  The  final  layering  scheme  for  static  slide  sequences  is  shown  in  Figure  HI- 13. 
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Figure  111-13:  Final  Layering  Scheme  for  Static  Slide  Sequences. 

Although  the  final  layering  scheme  does  not  stem  directly  from  the  partitions  in 
Figure  HI- 12,  continued  analysis  on  them  is  insightful  nonetheless.  Merging  partitions 
with  similar  variance  reduces  the  partitions  to  those  shown  in  Figure  HI- 14.  The 
difference  in  variances  of  partitions  Pi  and  P2  is  only  slightly  too  large  to  permit  their 
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merging  by  the  criterion  of  Equation  HI-3.  The  difference  is  small  enough,  however,  to 
justify  quantizing  both  bands  with  the  same  step  size.  The  simplicity  gained  in 
quantizing  both  bands  together  balances  any  potentially  suboptimal  bit  allocation. 
Therefore,  the  final  partitions  for  quantization  purposes  are  as  given  in  Figure  EI- 15. 
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Figure  111-14:  Partitions  after  Merging  Similar  Non-Contiguous  Partitions. 
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Figure  111-15:  Partitions  for  Quantizing  Purposes  (Static  Slides). 

4.         Quantization  and  Entropy  Encoding 

After  the  transform  stage,  the  coefficients  are  quantized  and  subjected  to  entropy 
encoding.  For  motion  video  sequences,  the  process  is  depicted  in  Figure  EI- 16.  The  base 
layer  (LL  subband)  is  quantized  using  the  luminance  matrix  (Figure  II-6)  via  the  function, 
quantizer  Jim,  run-length  encoded  using  a  zigzag  scan  via  the  function  zzjb.m,  and 
converted  to  a  VLC  using  the  luminance  VLC  table  suggested  in  [4].  The  conversion  to  a 
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VLC  is  accomplished  stepwise  via  three  functions:  make_it_compact.m,  parse  Jtiuff.m, 
and  get_bits_H.uff.rn.  This  approach  leverages  the  LL  subband's  retention  of  the  lowpass 
characteristics  of  the  original  macroblock.  The  implementation  is  as  discussed  previously 
in  Sections  n.C  and  ILD  with  the  scaling  factor  of  the  quantization  matrix  designated  as 
qi5.  The  remaining  subbands  are  each  quantized  uniformly  using  a  common  step  size  for 
all  its  coefficients.  Using  variance  as  an  indication  of  the  dynamic  range  of  the 
coefficients  within  a  given  subband  and  comparing  to  Figure  M-9,  it  is  reasonable  to 
quantize  all  the  subbands  composing  each  enhancement  layer  with  the  same  step  size. 
Therefore,  the  quantizer  step  size  for  layer  II  is  set  to  Q&,  that  of  layer  HI  is  set  to  q3. 
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Figure  111-16:  Quantization  and  Coding  for  Motion  Video  Macroblocks. 

Unlike  the  JPEG  based  coding  of  the  LL  subband,  zigzag  scanning  of  the 
quantized  FHT  coefficients  provided  no  apparent  coding  gain.  Instead,  trials  indicated  a 
simpler  horizontal  raster  scan  was  adequate  for  all  subbands  except  HL.  The  HL  showed 
a  slight  preference  for  a  vertical  raster  scan.  This  seems  rational  given  the  frequency 
detail  represented  in  this  band.  The  scan  orders  are  summarized  in  Table  III-2  where  the 
scan  order  applies  to  the  indicated  subband  as  well  as  its  child  subbands.  (The  LL  entry 
pertains  only  to  the  coding  of  the  static  content  macroblocks  as  discussed  later,  but  it  is 
included  here  for  completeness.)  Horizontal  raster  scans  are  performed  by  the  function, 


5  As  executed  in  the  code,  the  scaling  factor  qi  is  a  parameter  in  quantizer  Jim.  A  value  of  16  for  qj  means 
no  scaling  of  the  values  in  Figure  II-6.  A  value  smaller  than  16  results  in  finer  quantization  and  less 
quantization  noise,  qi  must  be  positive. 
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raster.m;  vertical  raster  scans  are  performed  by  the  function,  vertical.m.  Upon 
completion  of  the  run-length  encoding  accomplished  by  the  functions 
make_it_compact.m  and  parse _3D.m,  each  coefficient  is  represented  by  a  set  of 
codewords.  Each  codeword  is  then  mapped  to  an  entry  in  a  custom  VLC  table  via  the 
function  get_bits2.m.  The  VLC  table  is  provided  in  Appendix  A.  Its  structure  mirrors 
the  three-dimensional  (3-D)  event  structure  employed  by  the  H.263  coding  standard  [26], 
and  the  methodology  of  its  creation  is  presented  in  the  next  section. 


r,       ♦  c  uu    a       Raster  Scan 
Parent  Subband  „   , 

Order 


LL 

Horizontal 

LH 

Horizontal 

HL 

Vertical 

HH 

Horizontal 

Table  III-2:  Scan  Order  for  Run-Length  Encoding  Quantized  Coefficients. 

The  quantization  and  encoding  process  for  selected  macroblocks  from  static 
sequences  is  shown  in  Figure  EQ-17.  This  process  differs  from  the  motion  video  process 
in  three  ways.  JPEG-based  quantization  is  not  used;  all  sixteen  subbands  are  supplied  to 
one  of  three  independent  uniform  quantizers  with  fixed  step  sizes  of  qi,  q2,  and  q3. 
Contrary  to  the  motion  macroblock  scheme,  the  step  sizes  are  not  associated  with  layers, 
but  with  the  partitions  depicted  in  Figure  III- 15.  Finally,  the  VLC  table  that  is  provided 
in  Appendix  B  and  customized  for  static  slides  is  used.  The  same  three  functions  cited 
above  accomplish  these  steps  with  the  appropriate  VLC  table  being  employed  via  the 
argument  kind  in  the  function  get_bits2.m  signaling  the  type  of  frame  under 
consideration  -  motion  video  or  static  slide. 
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Figure  111-17:  Quantization  and  Coding  for  Static  Macroblocks. 

Neither  Figure  HI- 16  nor  Figure  IE- 17  indicates  the  presence  of  the  control  signal 
from  the  Control  Unit  shown  in  Figure  III-3.  The  control  signal  allows  manipulation  of 
ql,  q2,  and  q3  as  required  by  a  bit  rate  control  scheme.  Rate  control  is  covered  later. 

5.         Generating  Customized  VLC  Tables 

The  VLC  coding  scheme  mirrors  the  3-D  event  structure  employed  by  the  H.263 
standard.  Each  non-zero  coefficient  is  replaced  by  an  equivalent  event  described  by  three 
RLE  parameters  [26]: {LAST,  RUN,  LEVEL},  where  LAST  indicates  whether  there  are 
any  more  non-zero  coefficients  in  the  current  subband,  RUN  indicates  the  number  of 
successive  zeros  that  precede  the  non-zero  coefficient,  and  LEVEL  represents  the  non- 
zero magnitude  of  the  quantized  coefficient.  Each  event  maps  to  a  VLC  codeword  to 
which  a  sign  bit  is  appended  to  represent  the  sign  of  the  coefficient. 

Using  various  combinations  of  qi,  q2,  and  q3,  a  series  of  representative  motion 
video  test  sequences  was  processed.  The  processing  followed  the  implementation  as 
discussed  thus  far  (except  for  the  last  step  of  mapping  each  codeword  obtained  form  RLE 
to  a  VLC  table.)  The  relative  frequency  of  occurrence  of  each  RLE  codeword  was  then 
used  to  create  a  Huffman  VLC  utilizing  an  optimal  binary-branching  tree.  Since  every 
possible  {LAST,  RUN,  LEVEL}  event  is  not  guaranteed  to  be  formed  by  the  set  of  test 
video  sequences,  the  custom  VLC  table  further  mirrors  the  H.263  standard  in  that  a 
default  codeword  length  of  22  bits  is  used  for  any  event  not  contained  elsewhere  in  the 
table. 
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A  Huffman  code  offers  the  following  advantages.  The  average  codeword  length 
is  minimized  because  the  more  frequently  occurring  events  are  assigned  shorter 
codewords  (fewer  bits)  while  the  less  frequently  occurring  codewords  are  assigned  longer 
codewords  (more  bits).  Additionally,  each  codeword  is  uniquely  decipherable  ensuring 
that  no  codeword  can  be  a  prefix  to  a  longer  codeword. 

This  same  procedure  is  used  here  for  both  motion  video  sequences  and  static  slide 
sequences.  Again,  using  different  VLC  tables  for  the  two  basic  types  of  video  content  is 
advantageous  because  of  the  inherent  difference  between  their  frequency  content  and, 
consequently,  the  different  makeup  of  the  RLE  codeword  population. 

6.  Rate-Distortion  Relationship 

Compressed  video  is  inherently  variable  bit  rate  since  compression  gain  varies 
with  scene  activity  and  complexity.  However,  transmission  channels  inevitably  require  a 
constraint  on  bit  rate  due  to  finite  channel  capacity  or  QoS  guarantees.  Most  commonly, 
bit  rate  is  constrained  to  maintain  a  constant  rate  or  to  maintain  a  constant  local-average 
bit  rate  over  time.  Many  factors  affect  bit  rate,  but  the  most  important  is  the  tradeoff 
made  between  quantizer  step  size  and  image  fidelity.  A  larger  step  size  results  in  a  lower 
bit  rate  and  a  larger  amount  of  distortion.  Reducing  the  step  size  increases  the  bit  rate  but 
reduces  the  amount  of  distortion.  Rate  control  therefore  requires  evaluation  of  the  rate- 
distortion  relationship  created  by  a  particular  coder  design. 

The  rate  control  problem  may  be  posed  as  a  resource  allocation  problem  in  terms 
of  the  rate-distortion  relationship,  where  the  goal  is  to  minimize  distortion  D  for  a  bit  rate 
R  subject  to  a  bit  rate  constraint  Rc  [24],  i.e.,  min{Z)},  subject  to  R  <  Rc.  The 
corresponding  optimization  problem  is  solved  using  Lagrangian  methods  and  yields  the 
optimal  solution  for  a  particular  rate  constraint  as  a  point  along  the  rate-distortion  curve. 
Figure  III-l  8  shows  a  typical  rate-distortion  curve  and  an  optimal  solution  for  a  bit  rate  of 
i?o-  While  the  true  rate-distortion  curve  is  guaranteed  to  be  convex  [17],  the  operational 
curve  is  influenced  by  the  coder  design  including  the  motion-detection  scheme,  the 
quantizer  design,  and  lossless  coding  gains.  Therefore,  rate  control  schemes  tend  to  only 
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approximate  the  true  rate-distortion  relationship  when  determining  a  method  for  varying 
quantizer  step  size  to  achieve  the  desired  bit  rate. 
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Figure  111-18:  Rate  Distortion  Curve  with  Possible  Optimal  Solution. 

Additionally,  for  a  layered  coder  where  multiple  quantizer  parameters  are 
employed,  the  corresponding  multi-dimensional  aspect  of  the  rate-distortion  curve 
complicates  the  rate-control  problem.  Assuming  distortion  for  each  layer  i  is  additive, 
the  rate  control  problem  becomes  minimizing 
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where  N  is  the  total  number  of  layers.  The  assumption  of  additive  distortion  implies  that 
a  decrease  in  rate  requires  a  suitable  decrease  in  all  quantizer  parameters  to  yield  an 
optimal  solution.  However,  since  the  rate-distortion  curves  in  the  operational  coder  are 
not  necessarily  convex,  the  approach  above  does  not  necessarily  give  optimal  results.  An 
alternate,  albeit  heuristic,  approach  proposed  by  Parker  [9]  is  to  simplify  the  control 
problem  by  creating  a  simplified,  operational  rate-distortion  curve. 

Considering  the  test  motion  video  sequences,  an  operational  distortion  curve  is 
created  by  first  plotting  total  bit  rate  and  distortion  (measured  by  pSNR)  separately 
through  a  three-dimensional  space  spanned  by  the  set  of  candidate  quantizers.  This 
process  captures  the  operational  effect  of  the  coder  design,  such  as  the  values  of  the 
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quantizer  parameters  {qi,q2,q3}  and  the  VLC  coding  gain  as  well  as  any  interdependence 
between  layers,  on  the  rate-distortion  relationship.  The  result  is  best  described  as  a  four- 
dimensional  (4-D)  surface  wherein  both  rate  and  distortion  are  functions  of  a  triplet  of 
quantizer  parameters  {qi,q2,q3}.  Recall  that  the  first  parameter  represents  the  JPEG 
scaling  factor  while  the  remaining  parameters  represent  the  actual  quantizer  step  sizes. 
Next,  the  points  representing  the  pSNR  surface  are  sorted  in  a  descending  order 
and  associated  with  their  corresponding  average  bit  rates  and  quantizer  triplets.  Any 
triplet  set  yielding  a  higher  average  bit  rate  for  the  same  or  lower  pSNR  is  discarded.  The 
result  is  an  implicit  vector  quantization  of  the  operational  3-D  rate-distortion  space.  The 
dimensionality  of  the  operational  rate-distortion  curve  is  therefore  reduced  to  1-D  as 
shown  in  Figure  JTJ-19.  Each  point  on  the  curve  represents  results  from  a  single  optimal 
triplet.  Considering  only  those  quantizer  triplets  associated  with  average  bit  rates  about 
the  target  bit  rates  of  64-96  kbps  and  considering  a  5%  change  in  the  average  bit  rate  of 
the  coarsest  quantizer  triplet  as  a  reasonable  control  step,  the  operational  rate-distortion 
curve  of  Figure  HI- 19  reduces  to  that  shown  in  Figure  JJI-20.  The  corresponding 
quantizer  triplets  are  plotted  in  Figure  JTi-21.  These  results  indicate  that  an  optimal  rate 
control  scheme  does  not  necessarily  increase/decrease  each  quantizer  parameter  in 
lockstep  as  would  be  expected  if  distortion  in  each  layer  were  independent. 
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Figure  111-19:  Operational  Rate  Distortion  Curve  (Motion  Video). 
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Figure  111-20:  Reduced  Operational  Rate-Distortion  Curve  (Motion  Video). 
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Figure  111-21:  Quantizer  Table  Triplet  Values  (Motion  Video). 

The  same  approach  was  followed  for  static  slide  sequences.  However,  a  slightly 
different  behavior  was  observed.  Instead  of  the  expected  continued  decrease  in  MSE  as 
bit  rate  increased  with  finer  quantization,  the  behavior  was  as  depicted  in  Figure  ni-22. 
Past  a  limiting  value  of  MSE,  finer  quantization  and  the  related  increase  in  bit  rate 
yielded  no  increase  in  image  fidelity.  After  reconstructing  the  custom  VLC  table  with  the 
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quantizer  triplet  at  that  limiting  point  ({qi:q2:q3}  equal  to  {4:16:16})  the  average  bit  rate 
became  approximately  44  kbps  as  shown.  Since  this  bit  rate  is  below  64  kbps,  no 
additional  bit  rate  control  scheme  was  deemed  necessary;  all  static  slide  sequences  are 
quantized  with  the  same  triplet. 
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Figure  111-22:  Operational  Rate-Distortion  Curve  (Static  Slides). 


Bit  Rate  Control 


This  approach  to  the  rate-distortion  problem  provides  a  potential  method  for  a 
simplified  layered  rate  control  scheme  since  the  set  of  possible  quantizer  parameters  is 
reduced  to  a  far  smaller  set  of  parameters.  Considering  each  triplet  as  an  optimal 
quantizer  state,  any  control  scheme  would  manipulate  the  quantizers  for  each  layer  of  a 
motion  video  sequence  by  selecting  only  entries  from  this  set  via  a  simple  table  lookup. 
Parker  [9]  proposes  two  such  schemes.  One  functions  at  the  frame  level;  the  other 
operates  at  the  macroblock  level.  The  former  was  examined  and  implemented  for  this 
thesis  in  the  coder. 

Using  the  operational  rate-distortion  curve,  a  linear  control  curve  relating  bits  per 
frame  B  to  quantizer  setting  Q  is  created  as  shown  in  Figure  ni-23.  The  slope  AB/AQ 
represents  the  average  increment  or  decrement  in  bits  per  frame  with  a  step  change  in  the 
quantizer  table.  Dividing  this  quantity  by  the  average  number  of  macroblocks  selected 
per  frame  in  the  test  sequences,  M  ,  yields  the  desired  control  parameter  0: 
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The  calculated  value  of  (3  resulting  from  the  test  motion  video  sequences  was  -1 1 .346. 
This  control  parameter  was  then  used  to  adjust  the  coder  quantizer  setting  in  accordance 
with  the  following  scheme. 
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Figure  111-23:  Operational  Rate  Control  Curve  (Motion  Video). 

At  call  setup,  the  average  bit  allocation  per  frame  B  is  set  to 

-     R, 


B  = 


target 

7" 


(III- 10) 


where  i?target  is  the  channel  bit  rate,  and/is  the  frame  rate.  For  each  new  frame  i,  the 

actual  bit  allocation  from  the  last  frame  (/-l)  is  used  to  estimate  the  bit  allocation  error  or 
deviation  expected  to  result  from  the  current  frame  i  if  the  quantizer  setting  used  in  the 
previous  frame  is  not  changed.  Accounting  for  the  change  in  the  number  of  macroblocks 
selected  between  the  last  and  current  frames,  the  deviation  expected  is: 
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where  M,  is  the  number  of  macroblocks  selected  for  transmission  in  the  current  frame, 
Mm  is  the  number  of  macroblocks  selected  for  transmission  in  the  previous  frame,  and 
Z?._,  is  the  number  of  bits  used  in  the  transmission  of  the  previous  frame.  The  required 

change  in  the  quantizer  setting  is  calculated  using  the  deviation  Afl  inter,  the  number  of 
macroblocks  selected  for  transmission  in  the  current  frame  M„  and  the  control  parameter 

(3: 

A£._ 


A0,= 


(m-12) 


Here,  |_  J  is  the  fixed  integer  operator  that  discards  the  decimal  portion  of  the  result.  The 
result  indicates  the  quantizer  setting  from  the  last  frame  should  be  incremented  or 
decremented  by  A<2,  ■  If  the  quantizer  has  reached  the  upper  or  lower  limit  of  the  table, 
the  value  is  not  changed.  This  quantizer  triplet  selection  scheme  is  accomplished  via  the 
function,  get_qd_entryf.m. 

This  control  scheme  is  only  applicable  for  motion  video;  a  single  quantizer 
parameter  triplet  is  used  for  static  slides.  The  only  exception  is  due  to  a  scene  change. 
When  a  scene  change  is  detected  by  the  coder  (as  defined  in  the  next  section),  so  much  of 
the  frame  is  selected  for  transmission  that  a  spike  in  bit  rate  would  occur  if  quantized 
with  any  of  the  triplets  available  in  the  control  table.  To  avoid  this  undesirable  spike,  the 
first  frame  of  a  new  scene  is  heavily  compressed.  Following  this  initial  frame,  the 
appropriate  quantization  technique  ensues. 

8.         Scene  Change  Detection  and  Scene  Type  Determination 

Since  the  bit  rate  must  be  suppressed  during  a  scene  change  and  the  coder  must 
determine  which  of  two  possible  layering  schemes  to  employ  following  a  scene  change, 
these  criteria  must  be  defined.  The  coder  concludes  that  a  scene  change  has  occurred  if 
the  number  of  macroblocks  selected  exceeds  a  threshold.  This  threshold  was  determined 
from  the  block  selection  statistics  of  the  test  video  sequence  containing  the  most  highly 
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active  content  (though  still  considered  low-motion  video)  of  all  the  test  video  sequences 
examined.  For  this  sequence,  the  average  number  of  macroblocks  selected  per  frame  was 
34.61  and  their  standard  deviation  was  10.19.  The  threshold  was  set  at  three  standard 
deviations  above  the  mean  (65).  The  comparison  operation  is  performed  in  the  main 
code  block,  thesis.m.  A  frame  sequence  is  determined  to  be  static  when  macroblocks 
chosen  for  transmission  result  solely  from  aging.  In  this  case,  the  fixed  quantizer  triplet 
and  the  static  VLC  table  from  Appendix  B  are  used.  Otherwise,  the  sequence  is  deemed 
motion  video,  and  the  bit  rate  control  scheme  discussed  above  is  employed  with  its 
associated  custom  VLC  table  from  Appendix  B.  This  determination  is  performed  within 
the  function,  m_blk_id_xr.m. 

This  chapter  began  with  a  presentation  of  the  three  basic  techniques  available  for 
layered  video  coding.  The  approach  implemented  here,  as  proposed  by  Parker  [9],  is 
frequency-based  and  utilizes  the  FHT  and  the  2-D  DCT  for  motion  video  sequences  and 
the  FHT  alone  for  static  slide  sequences.  Different  frequency  transforms  are  used  due  to 
the  inherent  differences  in  the  perceptual  frequency  content  within  the  two  types  of 
sequences.  The  method  of  implementing  frame  refreshment  was  detailed  as  a  block 
selection  scheme  applied  at  the  macroblock  level  that  captures  perceptual  changes  due  to 
motion  within  a  scene  and  limits  the  duration  of  decoder  errors  at  receivers  by  forcing 
macroblock  updates  via  an  aging  algorithm.  Quantization  and  encoding  techniques  were 
presented  with  the  implementation  including  the  use  of  the  JPEG  standard  and  uniform 
quantization  coupled  with  one  of  two  custom  VLC  tables.  The  issue  of  rate  control  was 
addressed,  and  the  implementation  of  a  scheme  that  reduces  a  4-D  rate  control  surface  to 
a  simple  quantizer  table  lookup  to  control  bit  rate  at  the  frame  level  was  discussed. 
Finally,  the  manner  by  which  the  coder  detects  a  scene  change  and  determines  the  type  of 
sequence  under  consideration  was  delineated. 
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IV. 


RESULTS 


This  chapter  presents  some  results  from  a  short  video  segment  consisting  of  100 
frames  of  a  single  speaker  followed  by  50  frames  of  a  presentation  slide  filled  with  line 
diagrams  and  text.  The  Matlab  code  is  contained  in  Appendix  C.  A  sample  frame  from 
each  sequence  is  shown  in  Figure  IV- 1  and  Figure  IV-2.  Each  shows  the  original  frame 
and  the  reconstructed  frame  with  only  the  base  layer  received,  with  the  base  layer  and  the 
first  enhancement  layer  received,  and  with  all  layers  received. 


Original  Frame 


Layers  1 ,  2,  and  3 


I      ""* 

Figure  IV- 1:  Original  and  Reconstructed  Frames  from  a  Motion  Video  Sequence. 
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Figure  IV-2:  Original  and  reconstructed  Frames  from  a  Static  Video  Sequence. 

Figure  IV-3  shows  the  bit  rate  traces  for  the  150-frame  layered  video  segment. 
Part  (a)  displays  the  traces  resulting  from  the  constant  motion  video  quantizer  triplet 
({6,12,20})  expected  to  yield  80  kbps  based  on  the  average  bit  rate  resulting  form  all  test 
motion  video  sequences  and  the  fixed  static  sequence  quantizer  triplet.  Part  (b)  displays 
the  traces  resulting  from  the  bit  rate  control  scheme  for  the  motion  video  frames  with  the 
coder  attempting  to  achieve  an  average  80  kbps  and  the  fixed  static  sequence  quantizer 
triplet.  Bit  rate  spike  suppression  is  employed  for  the  initial  frame  of  each  scene  as 
discussed  previously.  The  distribution  of  bit  rate  offered  by  a  layered  video  coder  is 
evident  with  the  bit  rate  ratio  among  layers  being  approximately  5:3:2  for  both  sequences. 
As  congestion  occurs  in  the  network,  the  higher  layers  can  be  dropped  to  combat  the 
congestion  while  maintaining  much  of  the  quality  as  illustrated  in  Figure  IV- 1  and  Figure 
IV-2.  Neglecting  the  initial  frame,  the  average  bits  per  frame  for  the  motion  video 
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sequence  without  the  control  scheme  is  7454  bpf  with  a  standard  deviation  of  1362  bpf. 
With  the  control  scheme,  the  average  and  standard  deviation  are  7988  bpf  and  942  bpf, 
respectively.  As  expected,  the  bit  rate  from  the  static  sequence  is  much  lower  since  the 
bit  rate  results  solely  form  macroblock  aging.  This  illustrates  that  rate  control  is  not  of 
significant  benefit  for  static  sequences. 
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Figure  IV-3:  Bit  Rates  for  (a)  Fixed  Quantization  and  (b)  Bit  Rate  Control. 

Figure  IV-4  quantifies  the  progressive  improvement  in  quality  of  the 
reconstructed  video  based  on  pSNR  for  the  same  1 50-frame  layered  video  segment 
illustrated  in  Figure  IV- 1  and  Figure  IV-2.  At  the  beginning  of  each  sequence,  quality 
ramps  up  over  the  aging  interval  following  a  scene  change.  After  this  period,  quality  is 
observed  to  remain  relatively  flat  for  each  sequence  regardless  of  the  number  of  layers. 
For  the  motion  video  sequence,  the  base  layer  provides  a  smoothed  but  acceptable 
display.  Text  is  not  readable  but  the  speaker's  movements  are  easy  to  follow.  Adding 
the  first  enhancement  layer  improves  sharpness  and  adds  a  4  dB  improvement  in  pSNR 
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although  small  text  is  still  difficult  to  discern.  The  second  enhancement  layer  only  adds 
1-2  dB  improvement,  but  small  text  is  clearly  readable  and  other  features  with  fine  edges 
are  sharper.  With  static  video,  the  role  of  the  enhancement  layers  is  even  more  dramatic. 
Even  though  most  of  the  macroblock's  energy  is  included  in  the  base  layer  and 
contributions  from  each  frequency  band  are  included,  the  base  layer  still  shows  much 
softness  although  the  shapes  are  readily  identifiable.  The  first  enhancement  layer  adds  a 
7  dB  improvement  and  dramatically  improves  sharpness.  The  final  layer,  even  though 
the  bit  rate  contribution  is  the  smallest  of  the  three  layers,  almost  doubles  the  pSNR,  and 
the  reconstructed  frame  is  virtually  identical  to  the  original  frame.  Neglecting  the  initial 
frame,  the  average  pSNR  utilizing  all  layers  for  the  motion  video  sequence  without  the 
control  scheme  is  29.5  dB  with  a  standard  deviation  of  1.7  dB.  With  the  control  scheme, 
the  average  and  standard  deviation  are  29.8  dB  and  1.9  dB,  respectively.  Note  that  these 
values  include  the  ramp  up  in  quality  following  the  initial  scene  change.  These  average 
pSNR  values  are  not  directly  comparable,  however,  because  of  the  difference  in  average 
bit  rates;  the  higher  quality,  rate  control  approach  uses  approximately  500  more  bits  per 
frame.  But  since  this  quality  is  achieved  within  the  desired  bit  budget  of  80  kbps,  the  rate 
controller  allows  more  effective  utilization  of  the  available  bandwidth  to  better  image 
fidelity.  The  statistics  obtained  from  the  motion  video  sequence  traces  in  Figure  IV-3 
and  Figure  IV-4  are  summarized  in  Table  IV- 1. 
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Figure  IV-4:  pSNR  for  (a)  Fixed  Quantization  and  (b)  Bit  Rate  Control. 


Parameter 


With  Rate  Control      Without  Rate  Control 


Mean  Bit  Rate  (bpf) 

Bit  Rate  STD  (bpf) 

Mean  pSNR  (dB) 

pSNR  STD  (dB) 


7998 
942 

29.83 
1.92 


7454 
1362 

29.51 
1.74 


Table  IV-1:  Rate  Controlled  and  Uncontrolled  Motion  Video  Sequence  Statistics. 

Additionally,  the  issue  of  the  layered  video  codec's  resilience  to  bit  errors 
introduced  during  transmission  was  examined  using  the  same  motion  video  sequence  as 
above.  Using  loss  rates  of  10%,  25%,  and  50%,  four  different  case  were  tested.  Case  one 
distributed  the  bit  errors  across  the  layers  in  proportion  to  their  contribution  to  the  total 
bit  rate  and  utilized  zero-order  error  concealment;  the  reconstructed  frame  retained  the 
content  of  the  previous  frame  for  that  portion  lost  during  transmission.  If  the  loss 
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occurred  in  the  base  layer,  the  enhancement  layers  were  neglected.  If  the  loss  occurred  in 
an  enhancement  layer,  the  reconstruction  was  performed  utilizing  the  base  layer  and  the 
other  enhancement  layer.  Case  two  treated  all  layers  as  a  single  video  stream  and  utilized 
zero-order  error  concealment;  that  is,  a  transmission  loss  was  a  loss  for  all  layers.  Cases 
three  and  four  are  identical  to  cases  one  and  two,  respectively,  except  that  no  error 
concealment  was  used.  Instead  an  information  loss  caused  the  decoder  to  assign  the 
value  of  zero  to  all  coefficients  in  the  affected  layers.  As  Figure  IV-5  illustrates, 
spreading  bit  errors  across  multiple  layers  has  less  negative  impact  on  the  reconstructed 
image  at  high  loss  rates. 
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Figure  IV-5:  Comparison  of  Error  Resilience. 

Considering  static  slide  sequences,  the  relationship  is  more  complicated.  In 
general,  since  the  frame  content  is  entirely  static,  the  best  reconstruction  in  an  error  prone 
environment  is  to  forgo  a  macroblock  update  if  the  update  would  be  made  with  fewer 
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layers  than  the  present  reconstruction  of  that  macroblock.  Further  study  into  the 
implementation  of  such  a  scheme  is  warranted. 

This  chapter  presented  representative  frames  from  a  video  segment  consisting  of 
motion  video  and  static  slides.  For  each  type  of  content,  the  original  frame  and  the 
reconstructions  with  one,  two,  and  three  layers  were  given.  Also  presented  were  plots  of 
the  bits  per  frame  and  pSNR  as  a  function  of  the  frame  sequence  for  both  fixed 
quantization  and  variable  quantization  using  the  bit  rate  control  scheme.  These  plots 
served  to  quantify  the  quality  depicted  in  the  reconstructions  and  to  illustrate  the  bit 
allocation  among  the  layers.  Finally,  the  effect  of  spreading  bit  errors  across  layers 
compared  to  confining  them  to  a  single  video  stream  was  presented. 
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V.    CONCLUDING  REMARKS 

A.  CONCLUSIONS 

This  thesis  has  presented  the  development  and  implementation  of  a  new  layered 
video  codec  proposed  by  Parker  [9]  that  emphasizes  robust  transmission  of  a  video 
teleconference  (VTC)  at  low  bit  rates.  The  dual  nature  of  the  targeted  scene  content  - 
low-motion  video  and  static  slide  sequences  consisting  of  text  and  line  drawings  -  and 
the  assumed  application  in  a  multicast,  heterogeneous,  wireless  network  environment 
required  significant  flexibility  in  the  implementation.  The  essential  features  of  the  coder 
are  summarized  as  follows. 

Frame  refreshment  is  accomplished  via  block  updating  and  an  aging  algorithm, 
both  applied  at  the  macroblock  level.  This  approach  promotes  greater  robustness  because 
spatial  error  propagation  is  eliminated  and  temporal  error  propagation  is  greatly  limited. 
The  combined  technique  captures  perceptual  changes  due  to  motion  within  the  scene, 
limits  the  duration  of  error  artifacts  in  the  reconstruction  at  receivers,  and  ensures  that 
new  participants  in  a  VTC  session  that  is  already  in  progress  receive  a  complete  frame  in 
a  timely  manner. 

The  macroblocks  selected  for  transmission  are  decomposed  in  frequency  using  the 
fast  Haar  transform  (FHT).  For  motion  video  sequences,  the  lowpass  subband  is  further 
processed  with  the  two-dimensional  discreet  cosine  tranform.  The  horizontal  and  vertical 
edge  detail  subbands  are  further  decomposed  with  a  repeated  application  of  the  FHT. 
Static  slide  sequences  are  decomposed  solely  by  a  second  order  FHT  analysis. 

The  lowpass  subband  of  motion  video  is  quantized  and  encoded  using  the  JPEG 
standard  in  order  to  exploit  the  human  visual  system  perceptive  characteristics.  The 
remaining  subbands  of  motion  video  sequences  are  subjected  to  uniform  quantization  and 
encoding  with  a  custom  variable  length  coding  (VLC)  table.  All  subbands  of  static  slides 
are  subjected  to  uniform  quantization  and  encoding  with  a  separate  custom  VLC  table. 
All  quantization  is  performed  using  a  triplet  of  quantizers,  and  each  subband  is  quantized 
with  one  of  the  triplet  parameters  based  on  the  variance  of  subband  coefficients. 
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Subbands  are  assigned  to  layers  by  grouping  bands  of  coefficients  with  similar 
variances  into  a  common  layer.  Three  layers  are  used  in  the  coder.  The  base  layer  is 
independently  decodable  and  yields  an  acceptable,  minimum-quality  reconstruction;  each 
enhancement  layer  progressively  improves  the  quality  of  the  reconstruction. 

Bit  rate  is  controlled  at  the  frame  level  by  selecting  the  quantizer  triplet  to  be  used 
in  the  current  frame  based  on  the  number  of  bits  used  in  the  previous  frame,  the  desired 
average  bit  rate,  and  the  number  of  macroblocks  selected  for  transmission.  The 
implementation  involves  a  simple  table  lookup,  which  resulted  from  the  optimal  one- 
dimensional  reduction  of  a  four-dimensional  control  surface. 

B.  FUTURE  WORK 

The  implementation  and  the  results  presented  here  suggest  that  the  layered  video 
codec  has  potential  practical  utility  to  video  teleconferencing  in  multicast,  heterogeneous, 
wireless  networks.  Now,  several  aspects  of  the  coder  can  be  pursued  further.  For 
example,  although  three  layers  were  used  in  the  present  implementation,  the  techniques 
employed  can  be  used  to  scale  the  coder  to  include  an  arbitrary  number  of  layers.  Work 
on  techniques  to  dynamically  change  the  layering  scheme  within  a  sequence  is  desirable. 
The  ability  to  handle  color  and  audio  needs  to  be  incorporated  into  the  code.  Further 
refinement  of  the  block  search  pattern  utilized  for  macroblock  selection  and  the 
possibility  of  rate  control  at  the  macroblock  level  can  be  evaluated.  Also,  with  regard  to 
static  slide  sequences,  investigation  of  the  ability  to  detect  and  reconstruct  slight 
movement  within  such  a  frame,  such  as  the  movement  of  a  cursor,  is  warranted.  Finally, 
implementation  in  a  high  level  language  or  hardware  is  another  potential  future  task. 
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APPENDIX  A.  MOTION  VIDEO  VARIABLE  LENGTH  CODING  TABLE  (VLC) 

The  following  is  the  custom  VLC  table  used  with  motion  video  sequences.  The 
last  character  in  the  codeword  s  indicates  the  appended  sign  bit. 


INDEX 

LAST 

RUN 

LEVEL 

BITS 

CODEWORD 

1 

0 

0 

1 

3 

1  1  s 

2 

0 

0 

2 

5 

0101s 

3 

0 

0 

3 

6 

10010s 

4 

0 

0 

4 

9 

1 01 0001  1  s 

5 

0 

0 

5 

12 

10111001101s 

6 

0 

0 

6 

14 

1011111001110s 

7 

0 

1 

5 

01  00s 

8 

0 

2 

7 

100111s 

9 

0 

3 

10 

001 0001 01 s 

10 

0 

4 

12 

1011111 0000s 

11 

0 

5 

17 

101101001 00001  1  1  s 

12 

0 

6 

14 

10110100101 00s 

13 

0 

2 

1 

5 

0000s 

14 

0 

2 

2 

8 

1011001s 

15 

0 

2 

3 

12 

0111 0001 01 Os 

16 

0 

2 

4 

13 

001 0001 10110s 

17 

0 

3 

1 

6 

01111s 

18 

0 

3 

2 

9 

011001 00s 

19 

0 

3 

3 

11 

1011100101s 

20 

0 

3 

4 

13 

101101011001s 

21 

0 

4 

1 

6 

0001  1  S 

22 

0 

4 

2 

9 

01110010s 

23 

0 

4 

3 

10 

101101010s 

24 

0 

4 

4 

15 

00101101001111s 

25 

0 

5 

1 

6 

10010s 

26 

0 

5 

2 

10 

0001 01101s 

27 

0 

5 

3 

12 

1011111 0001  s 

28 

0 

5 

4 

16 

101101001 0001  1  1  s 

29 

0 

6 

1 

7 

011101s 

30 

0 

6 

2 

10 

0001 01110s 

31 

0 

6 

3 

14 

0111 0001 00000s 

32 

0 

6 

4 

16 

101101001 0001  1  Os 

33 

0 

7 

1 

7 

011011s 

34 

0 

7 

2 

10 

001 0001 00s 

35 

0 

7 

3 

14 

1011111001111s 

36 

0 

7 

4 

16 

101101001 000001  s 

37 

0 

8 

1 

7 

01 1 000s 

38 

0 

8 

2 

9 

001001 00s 

39 

0 

8 

3 

10 

001011011s 

40 

0 

8 

4 

13 

0111 0001 0001  s 

41 

0 

9 

1 

8 

0010111s 

42 

0 

9 

2 

11 

0010011010s 

43 

0 

9 

3 

14 

1001101011 000s 

44 

0 

10 

1 

8 

1011011s 

45 

0 

10 

2 

11 

1001101001s 

46 

0 

10 

3 

14 

0010011011010s 

47 

0 

10 

5 

16 

101101001 000000s 

63 


48 

49 

50 

51 

52 

53 

54 

55 

56 

57 

58 

59 

60 

61 

62 

63 

64 

65 

66 

67 

68 

69 

70 

71 

72 

73 

74 

75 

76 

77 

78 

79 

80 

81 

82 

83 

84 

85 

86 

87 

88 

89 

90 

91 

92 

93 

94 

95 

96 

97 

98 

99 

100 

101 

102 
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0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
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0 
0 
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0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
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0 
0 
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0 
0 
0 
0 
0 
0 
0 
0 
0 
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11 
11 
11 

12 
12 
12 
12 
13 
13 
13 
14 
14 
14 
15 
15 
16 
16 
17 
17 
18 
19 
20 
20 
21 
22 
22 
23 
23 
24 
24 
24 
25 
25 
26 
26 
27 
27 
28 
28 
29 
29 
30 
30 
31 
31 
32 
32 
33 
33 
34 
34 
35 
35 
36 
36 
37 


9 

12 
14 
8 
10 
13 
15 
9 

12 
15 
9 

13 
17 
10 
14 
10 
15 
11 
17 
11 
11 
11 
17 
12 
12 
16 
12 
15 
11 
16 
17 
12 
16 
12 
17 
12 
17 
12 
17 
12 
17 
12 
15 
12 
17 
12 
14 
13 
17 
13 
17 
13 
17 
12 
17 
13 


001 
001 
1  01 
001 
01  1 
1  01 
000 
1  01 
1  01 
001 
001 
001 
1  01 
01  1 
1  01 
1  01 
1  00 
000 
01  1 
1  01 
1  01 
1  01 
01  1 
01  1 
01  1 
1  01 
000 
001 
001 
1  01 
1  01 
001 
1  01 
1  01 
1  01 
000 
1  01 
001 
1  01 
1  01 
1  01 
1  01 
1  01 
1  00 
1  01 
1  00 
1  01 
1  01 
1  01 
001 
1  01 
1  01 
1  01 
1  00 
1  01 
000 


01  1 
000 
1  01 
000 
1  00 

1 1 1 

1  01 
1  1  0 

1 1 1 

01  1 
001 
000 
1  01 
001 
1  01 
000 
1  1  0 
1  01 
001 
000 
000 
1  01 
001 
001 
1  00 
1  01 
1  01 
01  1 
000 
1  01 
1  01 
001 
1  01 

1 1 1 

1  01 
1  01 
1  01 
000 
1  01 

1 1 1 

1  01 

1  01 

1  01 

1  1  0 

1  01 

1  1  0 

1  01 

1 1 1 

1  01 
000 
1  01 

1 1 1 

1  01 

1  1  0 

1  01 

1  01 


00s 

1 1 1 

001 
0s 
01  1 
1  00 
1  00 
00s 

1 1 1 

01  0 
01  s 
1  1  0 
001 
01  0 
000 
1  00 

11 1 
1 1 1 

01  1 
1  01 
1  01 
001 
01  1 
01  1 
01  0 
001 

1 1 1 

01  0 
1  1  0 
001 
001 
1  00 
001 
1  1  0 
000 
1  00 
000 

1 1 1 

000 
1  1  0 
001 
01  1 
000 

1 1 1 

001 
1  01 
000 

1 1 1 

001 

1 1 1 

001 
1  1  0 
001 
1  01 
001 

1 1 1 


00s 

0  10  1s 

s 

1  00s 
11101s 

1  1  s 

01 1 00s 


1 1 1 

001 
s 

1  01 
s 
1  00 

OS 

001 
1  s 
Os 
1  s 
001 

01  S 

1  1  s 
01  1 
1  1  s 
01  1 
Os 
01  1 
000 
1  1  s 
01  1 
01  s 
01  0 
00s 
01  0 
1  Os 
01  0 
00s 
001 
1  Os 
01  0 
00s 
001 
01  s 
1  01 
1  00 
001 
1  1  0 
001 
1  00 
001 
00s 
001 
1  00 


1010s 
1  s 
1  Os 
1111s 

1110s 

101s 

01  s 

1  00s 
01  00s 

111s 

1  000s 

1011s 

1010s 

0101s 

00s 

01  00s 

Os 

s 

0001  s 

s 

0000s 

s 

0011s 

001  Os 
s 
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104 
105 
106 
107 
108 
109 
110 
111 
112 
113 
114 
115 
116 
117 
118 
119 
120 
121 
122 
123 
124 
125 
126 
127 
128 
129 
130 
131 
132 
133 
134 
135 
136 
137 
138 
139 
140 
141 
142 
143 
144 
145 
146 
147 
148 
149 
150 
151 
152 
153 
154 
155 
156 
157 
158 
159 


38 
39 
40 
40 
41 
41 
42 
43 
43 
44 
45 
46 
46 
47 
48 
48 
49 
50 
51 
52 
53 
53 
54 
54 
55 
56 
56 
57 
57 
58 
59 
60 
61 
62 
0 
0 
0 
0 
0 


2 
2 
2 
2 
3 
3 
3 
3 
4 
4 
4 
4 


12 

12 
13 
16 
13 
17 
14 
13 
17 
13 
12 
14 
16 
13 
12 
15 
13 
13 
14 
14 
14 
16 
14 
16 
14 
13 
15 
14 
16 
15 
16 
16 
17 
15 
5 
7 
8 
11 
15 
6 
10 
13 
15 
17 
7 
11 
13 
16 
6 
10 
13 
16 
7 
12 
16 
16 


1  01 
1  01 
1  01 
001 
1  01 
1  01 
01  1 
01  1 
1  01 
1  01 
1  01 
01  1 
1  01 
1  00 
1  00 
000 
01  1 
1  01 
1  01 
01  1 
1  00 
1  01 
1  01 
001 
000 
1  01 
000 
000 
001 
1  00 
01  1 
000 
000 
001 
001 
01  1 
1  01 
001 
001 
1  00 
1  01 
1  01 
1  01 
1  01 
001 
001 
1  00 
1  01 
1  01 
1  00 
001 
001 
1  01 
001 
1  01 
1  01 


1  1  0 
1  01 

1 1 1 

01  1 

1  1  0 
1  01 
1  00 
001 
1  01 
1  1  0 
1  01 
1  00 
1  01 
1  1  0 
1  1  0 
1  01 
001 
1  01 
1  1  0 
1  00 
1  1  0 
1  01 

1 1 1 

01  1 
1  01 

1 1 1 

1  01 
1  01 
01  1 
1  1  0 
001 
1  01 
1  01 
01  1 
1  s 
01  0 
1  00 
001 
01  1 
01  s 

1 1 1 
1 1 1 

1  01 
1  01 
01  0 
01  1 
1  1  0 
1  01 
01  s 
1  1  0 
001 
01  1 
001 
001 
1  01 
1  01 


01  1 
000 

1 1 1 

01  0 
01  1 
001 
01  0 
01  1 
001 
01  0 
01  1 
01  0 
001 

1 1 1 

1  00 
1  00 
01  1 
000 
01  1 
01  0 
1  01 
001 
1  00 
01  0 
1  00 

1 1 1 

1  00 
1  00 
01  0 

1 1 1 

01  1 
1  00 
1  00 
01  0 


1  0s 
1  1  s 
001 
01  0 
1  1  0 
001 
01  0 
000 
001 
001 
01  s 
01  1 
01  1 
01  1 
00s 
1  00 

1 1 1 

01  1 
001 
01  1 
1  1  0 
01  1 
1  1  0 
01  0 
1  1  0 
000 
1  00 
1  1  0 
01  0 
1  00 
001 

1 1 1 

1 1 1 

000 


s 

1  00s 

s 

1101s 

Os 

s 

1  1  00s 

s 

1  s 

1  1  Os 
s 

00s 

s 

s 

Os 

Os 

Os 

001  s 

Os 

111s 

1  s 

s 

1  1  s 

Os 

001  s 

1  1  s 

1  1  Os 

000s 

001  Os 

00s 


s 

Os 

1  000s 

01001001s 

101s 
110101s 
00001 01  1  s 
001 00001 1 Os 
s 

0101s 
111101s 
001 0001 01 s 

1  1  Os 

1 001 00s 

01001 0000s 

s 

10111s 

001011 000s 

001 0001 00s 
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160 
161 
162 
163 
164 
165 
166 
167 
168 
169 
170 
171 
172 
173 
174 
175 
176 
177 
178 
179 
180 
181 
182 
183 
184 
185 
186 
187 
188 
189 
190 
191 
192 
193 
194 
195 
196 
197 
198 
199 
200 
201 
202 
203 
204 
205 
206 
207 
208 
209 
210 
211 
212 
213 
214 
215 


5 

5 

5 

6 

6 

7 

7 

7 

8 

8 

8 

9 

9 

9 

10 

10 

11 

11 

11 

12 

12 

12 

13 

13 

14 

14 

14 

15 

15 

15 

15 

16 

16 

17 

18 

19 

20 

20 

21 

22 

23 

23 

24 

24 

25 

26 

27 

28 

29 

30 

31 

31 

32 

32 

33 

34 


8 

15 
15 
8 

13 
8 

12 
14 
8 

13 
16 
9 

15 
17 
8 

15 
8 

14 
17 
8 
13 
16 
9 

14 
9 

18 
17 
8 

13 
14 
15 
12 
16 
13 
14 
13 
13 
18 
14 
13 
12 
15 
14 
17 
13 
14 
14 
14 
13 
13 
13 
17 
14 
17 
15 
14 


1  01 
000 
000 
000 
1  00 
01  1 
1  00 
1  00 
000 
1  01 
1  01 
01  1 
001 
1  01 
1  00 
001 
1  01 
1  00 
1  01 
1  01 
001 
1  01 
001 
01  1 
01  1 
000 
1  01 
000 
01  1 
000 
001 
1  01 
001 
01  1 
000 
001 
001 
000 
1  01 
001 
000 
1  01 
01  1 
1  01 
1  01 
01  1 
1  01 
1  0  1 
1  01 
1  00 
1  01 
1  01 
1  01 
1  01 
01  1 
01  1 


1 1 1 

1  01 
1  01 
1  00 
1  1  0 
001 
1  1  0 
1  1  0 
1  01 
1  1  0 
1  01 
1  00 
01  1 
1  01 
1  1  0 
01  1 
000 
1  1  0 
1  01 
1  1  0 
01  1 
1  01 
001 
1  00 
1  00 
1  01 
1  01 
1  00 
001 
1  01 
01  1 
1  01 
01  1 
001 
1  01 
000 
001 
1  01 

1 1 1 

000 
1  01 
1  01 
001 
1  01 
1  01 
001 
1  1  0 
1  01 
1  01 
1  1  0 
1  1  0 
1  01 

1 1 1 

1  01 
001 
001 


0s 
1  00 
1  00 
1  s 
1  01 
1  s 

1 1 1 

1  01 
Os 
01  0 
001 
1  1  s 
01  0 
001 
Os 
01  0 
Os 
1  01 
001 
1  s 
01  0 
001 
1  1  s 
01  0 
00s 
1  00 
001 
Os 
01  1 

1 1 1 

01  0 
01  1 
01  0 
01  1 

1 1 1 

1  1  0 
1  01 
1  00 

1 1 1 
1 1 1 

1  00 
000 
01  1 
000 
01  1 
01  1 
01  1 
000 
000 
1  00 
01  1 
001 

1 1 1 

001 
01  1 
01  1 


1 0001 s 
10010s 

101s 

1  1  s 
1001s 

01  Os 
011011s 

0001 1 s 
001  1 000s 

0001 Os 

1110s 
0011011s 

001  s 
011010s 

0001  s 

11100111s 
00001 01 s 

1  1  Os 

1011s 

01110s 

1  1  s 

010101s 

1  00s 

1010s 

101s 

1  00s 

11100110s 

1011s 

111s 

01  s 

01001s 

001  Os 

0101001s 

000s 


1011 

001  1 

001  1 

000s 

011s 

000s 

001011 

1010s 

0010110s 

00110s 

1010s 


1  s 
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216 
217 
218 
219 
220 
221 
222 
223 
224 
225 
226 
227 
228 
229 
230 
231 
232 
233 
234 
235 
236 
237 
238 
239 
240 
241 
242 
243 
244 
245 
246 
247 
248 


35 
36 
37 
38 
39 
40 
41 
42 
43 
44 
45 
46 
47 
48 
48 
49 
50 
51 
52 
53 
54 
54 
55 
55 
56 
56 
57 
58 
59 
60 
61 
62 
63 


14 

14 
14 
13 
13 
14 
13 
14 
14 
15 
14 
13 
13 
13 
17 
14 
15 
14 
15 
13 
14 
17 
12 
17 
12 
16 
13 
14 
15 
15 
14 
14 
14 


01  1 
001 
001 
1  01 
001 
001 
1  00 
1  00 
001 
000 
1  01 
1  01 
1  01 
001 
1  01 
1  00 
001 
1  01 
000 
1  00 
1  01 
1  01 
1  01 
1  01 
1  01 
001 
1  01 
000 
000 
000 
1  01 
1  00 
001 


1  00 
000 
001 
1  1  0 
001 
000 
1  1  0 
1  1  0 
000 
1  01 
1  01 
1  1  0 

1 1 1 

000 
1  01 
1  1  0 
01  1 

1 1 1 

1  01 

1  1  0 

1  01 

1  01 

1 1 1 

1  01 

1 1 1 

01  1 
1  1  0 
1  01 
1  01 
1  01 
1  01 
1  1  0 
000 


01  0 

1 1 1 

1  01 
01  1 
1  00 

1 1 1 

1  00 

1 1 1 
1 1 1 

1  00 
000 
01  0 
1  00 
1  1  0 
001 
1  01 
01  0 
1  00 
1  00 

1 1 1 

000 
001 
1  1  0 
001 

1 1 1 

01  0 
01  0 
1  00 
1  00 
1  00 
000 
1  01 

1 1 1 


01  0 
01  0 
1  01 

1 1 1 

1  01 
01  0 
01  0 
1  00 
01  1 
1  01 
001 
000 
1  01 
1  00 
001 
1  1  0 
000 
1  1  0 
1  01 
01  0 
1  00 
001 
1  1  s 
001 
01  s 
01  0 
01  1 

1 1 1 

1  01 
1  01 
1  00 

1 1 1 

01  1 


1  s 

1  s 

1  s 

s 

s 

0s 

s 

Os 

1  s 

01  s 

Os 

s 

s 

s 

1111s 

1  s 

01  s 

1  s 

00s 

s 

1  s 

1110s 

1001s 

1  1  Os 
s 

1  s 
1  1  s 
1  Os 
Os 
1  s 
Os 
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APPENDIX  B.  STATIC  SLIDE  VARIABLE  LENGTH  CODING  TABLE  (VLC) 

The  following  is  the  custom  VLC  table  used  with  static  slide  sequences.  The  last 
character  in  the  codeword  s  indicates  the  appended  sign  bit. 


INDEX 

LAST 

RUN 

LEVEL 

BITS 

CODEWORD 

1 

0 

0 

1 

4 

101s 

2 

0 

0 

2 

4 

1  1  Os 

3 

0 

0 

3 

5 

1  000s 

4 

0 

0 

4 

7 

1111 00s 

5 

0 

0 

5 

7 

001110s 

6 

0 

0 

6 

6 

00110s 

7 

0 

0 

7 

13 

011011010011s 

8 

0 

0 

8 

10 

011101011s 

9 

0 

0 

10 

11 

1111101011s 

10 

0 

0 

11 

11 

1 001 0001 00s 

11 

0 

0 

13 

11 

1001101 000s 

12 

0 

0 

20 

15 

001111011111 00s 

13 

0 

0 

24 

12 

001111011 00s 

14 

0 

0 

28 

11 

0110110101s 

15 

0 

0 

32 

9 

00001 1 00s 

16 

0 

0 

36 

9 

011111 00s 

17 

0 

0 

40 

8 

00001 00s 

18 

0 

0 

44 

8 

10011 00s 

19 

0 

0 

48 

6 

0001 1 s 

20 

0 

0 

52 

8 

0110011s 

21 

0 

0 

56 

8 

0010101s 

22 

0 

0 

60 

9 

001 0001 Os 

23 

0 

0 

64 

4 

01  Os 

24 

0 

1 

6 

1  1 1 00s 

25 

0 

2 

6 

0001 Os 

26 

0 

3 

8 

0111011s 

27 

0 

4 

10 

111111111s 

28 

0 

5 

10 

001111110s 

29 

0 

6 

9 

001 00001  s 

30 

0 

7 

16 

001111011110010s 

31 

0 

8 

12 

10010010111s 

32 

0 

10 

14 

0111101001 000s 

33 

0 

13 

15 

1111111 00001 01  s 

34 

0 

2 

1 

7 

0111 00s 

35 

0 

2 

2 

7 

011010s 

36 

0 

2 

3 

8 

0010011s 

37 

0 

2 

4 

11 

01111011 00s 

38 

0 

2 

5 

9 

10010011s 

39 

0 

2 

6 

9 

10011110s 

40 

0 

2 

7 

17 

0011110111111110s 

41 

0 

2 

8 

12 

00101001110s 

42 

0 

2 

10 

14 

1111111  000001 s 

43 

0 

2 

11 

15 

11111110011 000s 

44 

0 

2 

13 

14 

1111111 00001  1  s 

45 

0 

3 

1 

8 

0111100         s 

46 

0 

3 

2 

8 

01 1 0000s 

47 

0 

3 

3 

9 

00001101s 

69 


48 

49 

50 

51 

52 

53 

54 

55 

56 

57 

58 

59 

60 

61 

62 

63 

64 

65 

66 

67 

68 

69 

70 

71 

72 

73 

74 

75 

76 

77 

78 

79 

80 

81 

82 

83 

84 

85 

86 

87 

88 

89 

90 

91 

92 

93 

94 

95 

96 

97 

98 

99 

100 

101 

102 

103 


0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 


3 

3 

3 

3 

3 

3 

3 

3 

4 

4 

4 

4 

4 

4 

4 

5 

5 

5 

5 

5 

5 

5 

6 

6 

6 

6 

6 

6 

6 

7 

7 

7 

7 

7 

7 

7 

7 

8 

8 

8 

8 

8 

8 

8 

8 

9 

9 

9 

9 

9 

9 

10 

10 

10 

10 

10 


4 
5 
6 
7 
8 
10 
11 
13 
1 
2 
3 
4 
5 
6 
8 
1 
2 
3 
4 
5 
6 
8 
1 
2 
3 
4 
5 
6 
10 
1 
2 
3 
4 
5 
6 
7 
10 
1 
2 
3 
4 
5 
6 
8 

10 
1 
2 
3 
4 
5 
6 
1 
2 
3 
4 
5 


11 
10 
10 
17 
13 
15 
15 
16 
8 
8 
9 
10 
11 
9 

15 
10 
9 

10 
12 
12 
11 
15 
9 
9 
10 
13 
11 
12 
15 
10 
10 
11 
15 
16 
13 
16 
15 
9 
9 
10 
11 
12 
10 
15 
15 
9 
9 
11 
15 
11 
13 
12 
11 
13 
15 
13 


00001 
001  00 
00111 
00111 
10010 
11111 
11111 
01101 
01100 
01111 
001  00 
10011 
00001 
001  00 

11111 

01111 
01101 
01111 
10011 
00111 

11111 
11111 
11111 

00001 
10011 

11111 

10010 
01111 

11111 

11110 
001  00 
10011 
01101 
001  00 
01110 
00111 

11111 

00101 
00101 
10011 
001  00 
10011 
01111 

11111 
11111 

11110 
10010 
00101 
00111 
11110 

11111 

00111 
00100 
001  00 
00111 
00111 


1  1  0 
1  01 
1  1  0 
1  01 
001 
1  1  0 
1  1  0 
1  01 
01  s 
1  1  s 
1  00 
01  0 
1  1  0 
01  1 
1  1  0 
1  01 
1  00 
01  0 

1 1 1 

1  00 
01  0 
1  1  0 
01  1 
01  0 
01  1 
1  1  0 
01  0 
01  1 
1  1  0 
1  00 
000 
01  0 
1  01 
1  01 
1  01 
1  01 
1  1  0 
000 

1 1 1 

01  1 

1  01 

1 1 1 

01  1 
1  1  0 
1  1  0 
1  01 
000 
001 
1  01 
1  00 
1  1  0 
1  01 
000 
000 
1  01 
1  01 


00s 
0s 
1  s 

1 1 1 

01  1 
01  1 
01  1 
001 


s 

1  s 
01  s 
s 

01  0 
Os 
s 

1  s 
001 
01  0 
00s 
01  0 
s 
s 
1  s 

1 1 1 

00s 
01  1 
01  1 

0  s 
Os 

01  s 
001 
1  00 
000 
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APPENDIX  C.  MATLAB  CODE  LIBRARY 

This  appendix  contains  the  Matlab  code  used  in  the  layered  video  codec.  The 
main  code  block,  thesis,  m,  is  provided  first,  and  the  supporting  functions  follow  in 
alphabetical  order.  As  provided,  thesis,  m  processes  four  video  sequences:  two  motion 
video  sequences  of  100  frames  each,  followed  by  two,  static  slide  sequences  of  50  frames 
each. 


%  thesis 

format  compact 
clear  all 
close  all 


global  QD_TABLE  VLC_DYN  VLC_STA  RV  HUFFJTABLE  LAST 

load ( ' g : \QD_TABLE ' ) 
load ('g:\VlC_dyn' ) 
load{ 'g: \VLC_sta' ) 


VLC_DYN  =  VLC_dyn;  clear  VLC_dyn 
VLC_STA  =  VLC_sta;  clear  VLC_sta 
slides 

rand( ' state' , 0) 

RV  =  floor (21*rand (99, 1) ) ; 

HUFFJTABLE  =  make_HAC_table; 

LAST  =  ones (99, 1) ; 

threshold  =  160; 

count  =  -1; 

mse  =  1; 

display  ■  1; 

write  =  1; 

last  =  299; 

tOff  =  0; 

frame_type  =  0; 

m_mat_ndx  =  [ ] ; 

selected) 

f_vec  =  zeros (1584,  16)  ; 

f_far_ll  =  doubleUeros  (792,  8)  ) 

f_far_lh  =  doubleUeros  (792,  8)  ) 

f_far_hl  =  doubleUeros  (792,  8)  ) 

f  far  hh  =  double (zeros  (792, 8 )  ) 


f_far_ll_pl 
f_far_lh_pl 
f_far_hl_pl 
f_far_hh_pl 

f_far_ll_p2 
f_far_lh_p2 
f_far_hl_p2 
f  far  hh  p2 


double (zeros  (792, 
double (zeros (792, 
double (zeros (792, 
double (zeros  (792, 

double (zeros  (792, 
double (zeros (792, 
double (zeros (792, 
double  (zeros (792, 


f_l  =  double(zeros(1584,16) 
f_2  =  doubleUeros  (1584,  16) 
f  3  =  doubleUeros  (1584,16) 


%  gets  custom  VLC  table  for  motion  video 
%  gets  custom  VLC  table  for  static 


%  seed  for  reproductivity 

%  for  aging  algorithm 

%  gets  JPEG  standard  table 

%  sets  block  selection  to  1  initially 

%  for  use  with  asd 

%  where  is  it  in  the  loop? 

%  calculate  MSE? 

%  show  the  images? 

%  write  to  file  for  evaluation? 

%  last  frame  #  to  consider 

%  intialize  (used  with  scene  change) 

%  initialize  (zero  means  dynamic  scene) 

%  initialize  (track  macroblocks 

%  initialize  (reshaped  frame) 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 

%  initialize  for  decoder 
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beta  =  -11.346; 


%  slope  of  rate  control  curve 


br  =  input ( 'Enter  the  target  bitrate  (Kbps)  \n  >>  ') 

B_bar  =  br  *  1024/10; 

qd_entry  =  min(f ind(B_bar  >  QDJTABLE ( : , 4 ) ) ) ;     %  intial  triplet  based  on  test  averages 


%  loop  through  frames 

%  read  head# . bmp 

%  read  ncaa# . bmp 

%  read  busy  text  slide 

%  read  line  drawing  slide 

%  diplays  original  if  desired 


for  i  =  0:last 

count  =  count+1 

if  (i<=99) 

i_present  =  get_next_image ( i ) ; 
elseif  (  (i>=100)  &  (i<=199)  ) 

i_present  =  get_next_image_2 (i-100) ; 
elseif  (i==200) 

i_present  =  get_next_image (101)  ; 
elseif  (i==250) 

i_present  =  get_next_image (102)  ; 
else 

i_present; 
end 

if  display 

figure ; 

subplot (2, 2,1) 

image ( i_present ) 

colormap(gray(256) ) 

title ( [ 'Original' ] ) 

axis  off 
end 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Coder  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%  Change  dimensions  of  image  and  identify  macroblocks  by  threshold  %%%%%%%% 

f_last  =  f_vec;  %  buffers  previous  frame 

f_vec  =  shape ( i_present ) ;  %  shapes  current  frame 

[m_vec_ndx, f rame_type]  =  m_blk_id_xr (f_last, f_vec, threshold) ;  %  compare  current  to 
last 


m_mat_ndx  =  [m_mat_ndx,m_vec_ndx] 

if  (sum(m_vec_ndx)  >=  65) 

tOff  =  1; 
end 


%  matrix  of  MB's  selected 

%  triggers  a  scene  change 
%  "trigger"  flag  set  "on" 


%  scene  change  just  occured 

%  resets  "trigger"  flag 

%  "flag"  for  1st  frame  after  scene  change 

%  heavily  compressed  scene  change 


if  (tOff) 

tOff  =  0; 

flag  =  1; 

ql  =  64; 

q2  =  1000; 

q3  =  1000; 
elseif  frame_type 

ql  =  4; 

q2  =  16; 

q3  =  16; 
else  %  a  dynamic  frame  sequence 

delta_Binter  =  B_bar  -  (sum(m_vec_ndx) /sum(m_iriat_ndx( : ,  i)  )  )  *  total(i); 

qd_entry  =  get_qd_entryf (f lag, B_bar, qd_entry, delta_Binter , sum(m_vec_ndx) ,beta) ; 

flag  =0;  %  resets  flag 


%  triplet  for  static  sequence 


ql  =  QDJTABLE ( qd_ent  ry , 1 ) 
q2  =  QDJTABLE (qd_entry,  2) 
q3  =  QDJTABLE (qd_entry, 3) 
end 

Q  =  get_Ql_matrix(ql) ; 

if  -frame_type 


%  rate  control  triplets  fetched 

%  quantizer  matrix  (via  JPEG  standard) 
%  if  dynamic  frame  sequence 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Transforms  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

[f_fht_ll, f_fht_lh, f _f ht_hl , f _f ht_hh  ]  =  fht ( f_vec,m_vec_ndx, 16) ;   %  FHT  of  frame 

f_fht_ll  =  f_fht_ll  -  128;  %  level  shift  of  LL 

f_fht_ll  =  dct_of_fht(f_fht_ll,m_vec_ndx) ;  %  2_D  DCT  of  LL 

[f_fht_lh_ll, f_fht_lh_lh, f_fht_lh_hl, f_fht_lh_hh]  =  . . . 

fht(f_fht_lh,m_vec_ndx, 8) ;  %  subband  LH  FHT 

[ f _f ht_hl_ll , f_f ht_hl_lh, f _f ht_hl_hl , f_f ht_hl_hh]  =  . . . 

fht  (f_fht_hl/m_vec_ndx,  8)  ,-  %  subband  HL  FHT 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Quantizing  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_fht_ll_q  =  quantizer_ll (f_fht_ll,Q,m_vec_ndx) ;        %  quantize  LL 

f_fht_hh_q  =  round ( f_fht_hh/q3 ) ;  %  quantize  HH 

f_fht_lh_ll_q  =  round ( f_fht_lh_ll /q2 )  ,•  %  quantize  of  LH  subbands 

f_fht_lh_lh_q  =  round (f_fht_lh_lh/q2) 
f_fht_lh_hl_q  =  round (f_fht_lh_hl/q3) 
f_fht_lh_hh_q  =  round (f_fht_lh_hh/q3) 


%  quantize  of  HL  subbands 


f_fht_hl_ll_q  =  round (f_fht_hl_ll/q2) 

f_fht_hl_lh_q  =  round (f_fht_hl_lh/q3) 

f_fht_hl_hl_q  =  round (f_fht_hl_hl/q2) 

f_fht_hl_hh_q  =  round (f_fht_hl_hh/q3) 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LL  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  zig-zag  scans  each  LL  8x8  (Results  are  one  6336x1  vector  [ll_zz]  and  the  index 
%  of  the  last  non-zero  entity  [last_ll_zz] .   Get  "inf"  if  a  group  of 

%  64  is  all  zeros. ) 

[last_ll_zz,ll_zz]  =  zzb(f_fht_ll_q,8) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size.) 
a_ll_zz  =  make_it_compact (ll_zz, last_ll_zz, 8) ; 

%  parsing  LL  with  Huffman  routine 
parsed_ll_zz  =  parse_Huf f (a_ll_zz, last_ll_zz) ; 

%  gets  bits  per  frame  due  to  LL 

bits_ll_zz (i+1)  =  get_bits_Huf f (parsed_ll_zz) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HH  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH  8x8  (Results  are  one  6336x1  vector  [hh_r]  and  the  index  of  the  last 
%  non-zero  entity  [last_hh_r] ,  where  r  indicates  horizontal  raster 

%  the  scan  method.   Get  "inf"  if  a  group  of  64  is  all  zeros.) 

[last_hh_r,hh_r]  =  raster (f_fht_hh_q, 8) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_r  =  make_it_compact  (hh_r,  last_hh_r,  8)  ,- 

%  parsing  HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_r_3D  =  parse_3D(a_hh_r, last_hh_r) ; 
l_P_hh_r_3D ( i+1 )  =  length (par sed_hh_r_3D ( : , 1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hh_r_3D(i+l) ,  hh_r_22 (i+1) ]  =  get_bits2 (parsed_hh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LH  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LL  4x4  (Results  are  one  1584x1  vector  [lh_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_ll_r,lh_ll_r]  =  raster (f_fht_lh_ll_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_ll_r  =  make_it_compact (lh_ll_r , last_lh_ll_r,  4)  ; 

%  parsing  LH_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_ll_r_3D  =  parse_3D(a_lh_ll_r, last_lh_ll_r)  ; 
l_p_lh_ll_r_3D(i+l)  =  length(parsed_lh_ll_r_3D(:,l)  )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_ll_r_3D(i+U ,  lh_ll_r_22 (i+1) ]  =  get_bits2 (parsed_lh_ll_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LH  4x4  (Results  are  one  1584x1  vector  [lh_lh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_lh_r,lh_lh_r]  =  raster ( f_fht_lh_lh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_lh_r  =  make_it_compact ( lh_lh_r , last_lh_lh_r ,  4 )  ; 

%  parsing  LH_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_lh_r_3D  =  parse_3D(a_lh_lh_r, last_lh_lh_r)  ; 
l_p_lh_lh_r_3D(i+l)  =  length (parsed_lh_lh_r_3D ( : ,  1) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_lh_r_3D(i+D ,  lh_lh_r_22 (i+1) ]  =  get_bits2 (parsed_lh_lh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HL  4x4  (Results  are  1584x1  vector  [lh_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[  last_lh_hl_r ,  lh_hl_r  ]  =  raster  ( f_fht_lh_hl_q,  4 )  ,- 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_hl_r  =  make_it_compact ( lh_hl_r , last_lh_hl_r ,  4 )  ; 

%  parsing  LH_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hl_r_3D  =  parse_3D(a_lh_hl_r, last_lh_hl_r) ; 
l_p_lh_hl_r_3D ( i+1 )  =  length (parsed_lh_hl_r_3D ( : ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_hl_r_3D(i+l) ,  lh_hl_r_22 (i+1) ]  =  get_bits2 (parsed_lh_hl_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HH  4x4  (Results  are  1584x1  vector  [lh_hh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_hh_r,lh_hh_r]  =  raster (f_fht_lh_hh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
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a_lh_hh_r  =  make_it_compact  (lh_hh_r,  last_lh_hh_r,  4)  ,- 

%  parsing  LH_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hh_r_3D  =  parse_3D(a_lh_hh_r, last_lh_hh_r) ; 
l_p_lh_hh_r_3D(i+l)  =  length ( par sed_lh_hh_r_3D( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_hh_r_3D(i+l) ,  lh_hh_r_22 (i+1) ]  =  get_bits2 (parsed_lh_hh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HL  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LL  4x4  (Results  are  a  1584x1  vector  [hl_ll_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_ll_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_ll_v,hl_ll_v]  =  vertical (f_fht_hl_ll_q, 4 ) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_ll_v  =  make_it_compact (hl_ll_v, last_hl_ll_v, 4) ; 

%  parsing  HL_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_ll_v_3D  =  parse_3D(a_hl_ll_v, last_hl_ll_v) ; 
l_p_hl_ll_v_3D(i+l)  =  length (parsed_hl_ll_v_3D(: ,1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_ll_v_3D(i+D ,  hl_ll_v_22 (i+1) ]  =  get_bits2 (parsed_hl_ll_v_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LH  4x4  (Results  are  a  1584x1  vector  [hl_lh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_lh_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_lh_v,hl_lh_v]  =  vertical (f_fht_hl_lh_q, 4 ) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_lh_v  =  make_it_compact  (hl_lh_v,  last_hl_lh_v,  4)  ,- 

%  parsing  HL_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_lh_v_3D  =  parse_3D(a_hl_lh_v, last_hl_lh_v) ; 
l_p_hl_lh_v_3D(i+l)  =  length (parsed_hl_lh_v_3D(: ,1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_lh_v_3D(i+l)  ,  hl_lh_v_22 (i+1) ]  =  get_bits2 (parsed_hl_lh_y_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hl_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hl_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_hl_v,hl_hl_v]  =  vertical (f_fht_hl_hl_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hl_v  =  make_it_compact (hl_hl_v, last_hl_hl_v, 4) ; 

%  parsing  HL_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_hl_v_3D  =  parse_3D(a_hl_hl_v,  last_hl_hl_v)  ,- 
l_p_hl_hl_v_3D  ( i+1 )  =  length  (parsed_hl_hl_v_3D  (  :  ,  1 )  )  ,- 
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%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_hl_v_3D(i+l) ,  hl_hl_v_22 (i+1) ]  =  get_bits2 (parsed_hl_hl_v_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hh_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_hh_v,hl_hh_v]  =  vertical  ( f _f ht_hl_hh_q ,  4 )  ,- 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hh_v  =  make_it_compact (hl_hh_v, last_hl_hh_v, 4 ) ; 

%  parsing  HL_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_hh_v_3D  =  parse_3D(a_hl_hh_v, last_hl_hh_v) ; 
l_p_hl_hh_v_3D ( i+1 )  =  length (parsed_hl_hh_v_3D ( :  ,  1 ) )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_hh_v_3D(i+D ,  hl_hh_v_22 (i+1) ]  =  get_bits2 (parsed_hl_hh_v_3D, frame_type) ; 

%  getting  bits  per  frame  per  layer 

layerl(i+l)  =  bits_ll_zz (i+1) ; 

layer2(i+l)  =  bits_lh_ll_r_3D(i+l)  +  bits_lh_lh_r_3D(i+l)  +  ... 

bits_hl_ll_v_3D(i  +  l)  +  bits_hl_hl_v_3D(i+l) ; 
layer3(i+l)  =  bits_hh_r_3D(i+l)  +  bits_lh_hl_r_3D(i+l)  +  bits_lh_hh_r_3D(i+l)  +  ... 

bits_hl_lh_v_3D(i  +  l)  +  l_p_hl_hh_v_3D(i  +  l)  ; 
total (i+1)  =  layerl(i+l)  +  layer2(i+l)  +  layer3 (i+1) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%  % 

%  Channel  % 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Unquantize  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_fht_ll_uq  =  unquantize_ll (f_fht_ll_q,Q,m_vec_ndx) ;  %  unquantize  LL 

f_fht_hh_uq  =  f_fht_hh_q  *  q3 ;  %  unquantize  HH 

f_fht_lh_ll_uq  =  f_fht_lh_ll_q  *  q2 ;  %  unquantize  of  LH  subbands 

f_fht_lh_lh_uq  =  f_fht_lh_lh_q  *  q2 

f_fht_lh_hl_uq  =  f_fht_lh_hl_q  *  q3 

f_fht_lh_hh_uq  =  f_fht_lh_hh_q  *  q3 


%  unquantize  of  HL  subbands 


f_fht_hl_ll_uq  =  f_fht_hl_ll_q  *  q2 

f_fht_hl_lh_uq  =  f_fht_hl_lh_q  *  q3 

f_fht_hl_hl_uq  =  f_fht_hl_hl_q  *  q2 

f_fht_hl_hh_uq  =  f_fht_hl_hh_q  *  q3 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Inverse  Transform  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_far_ll  =  invdct_of_fht(f_fht_ll_uq,m_vec_ndx) ;    %  inv  2-D  dct  of  LL 

f_far_ll  =  f_far_ll  +  128;  %  level  shift 

f_far_lh  =  remake_3(f_far_lh,f_fht_lh_ll_uq,f_fht_lh_lh_uq,f_fht_lh_hl_uq,  ... 

f_fht_lh_hh_uq,m_vec_ndx, 8) ;  %  LH  subband  inv  FHT 

f_far_hl  =  remake_3(f_far_hl,f_fht_hl_ll_uq, f_fht_hl_lh_uq,f_fht_hl_hl_uq,  ... 

f_fht_hl_hh_uq/m_vec_ndx, 8) ;  %  HL  subband  inv  FHT 

f_far_hh  =  f_fht_hh_uq; 

f_far_ll_p2  =  f_far_ll;  %  LL  p2  assignment 

f_far_lh_p2  =  remake_3 (f_far_lh_p2, f_fht_lh_ll_uq, f_fht_lh_lh_uq, . . . 

0,0, m_vec_ndx , 8 ) ;  %  LH  p2  subband  inv  FHT 

f_f ar_hl_p2  =  remake_3 ( f _f ar_hl_p2 , f _f ht_hl_ll_uq, 0 , . . . 

f_fht_hl_hl_uq,0,m_vec_ndx,8) ;  %  HL  p2  subband  inv  FHT 
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f_far_hh_p2 


0; 


%  HH  p2  assignment 


f_far_ll_pl 
f_far_lh_pl 
f_far_hl_pl 
f_far_hh_pl 


f_far_ll; 

0; 

0 

0 


%  LL  pi  assignment 

%  HH  pi  assignment 

%  HH  pi  assignment 

%  HH  pi  assignment 


f_3  =  remake_3 ( f_3 , f_f ar_ll , f _f ar_lh, f_f ar_hl , . . . 

f_far_hh,m_vec_ndx, 16) ;  %  frame  inv  FHT  with  3  layers 

f_2  =  remake_3 ( f_2 , f_f ar_ll_p2 , f_f ar_lh_p2 , f_f ar_hl_p2 , . . . 

f_far_hh_p2,m_vec_ndx, 16) ;  %  frame  inv  FHT  with  2  layers 

f_l  =  remake_3 (f_l, f_far_ll_pl, f_far_lh_pl, f_far_hl_pl, . . . 

f_far_hh_pl,m_vec_ndx, 16) ;  %  frame  inv  FHT  with  1  layer 

else  %  a  staic  frame  sequence 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Transforms  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
[f_fht_ll, f_fht_lh, f_fht_hl, f_fht_hh  ]  =  fht (f_vec,m_vec_ndx, 16) ;   %  FHT  of  frame 


[f_fht_ll_ll, f_fht_ll_lh, f_fht_ll_hl, f_fht_ll_hh]  = 
f ht ( f _f ht_l 1 , m_vec_ndx ,  8 )  ; 

[f_fht_lh_ll, f_fht_lh_lh, f_fht_lh_hl, f_fht_lh_hh]  = 
fht ( f _f ht_lh, m_vec_ndx, 8 ) ; 

[f_fht_hl_ll, f_fht_hl_lh, f_fht_hl_hl, f_fht_hl_hh]  = 
fht ( f _f ht_hl , m_vec_ndx, 8 ) ; 

[  f  _f ht_hh_ll , f _f ht_hh_lh, f _f ht_hh_hl , f_f ht_hh_hh]  = 
fht ( f _f ht_hh, m_vec_ndx, 8 ) ; 


%  subband  LL  FHT 


%  subband  LH  FHT 


%  subband  HL  FHT 


%  subband  HH  FHT 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Quantizing  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_fht_ll_ll_q  =  round ( f_fht_l 1_1 1 / gl ) ;  %  quantize  of  LL  subbands 

f_fht_ll_lh_q  =  round (f_fht_ll_lh/q2) 
f_fht_ll_hl_q  =  round (f_fht_ll_hl/q2) 
f_fht_ll_hh_q  =  round (f_fht_ll_hh/q3) 


f_fht_lh_ll_q  =  round (f_fht_lh_ll/q2) 
f_fht_lh_lh_q  =  round (f_fht_lh_lh/q2) 
f_fht_lh_hl_q  =  round (f_fht_lh_hl/q3) 
f_fht_lh_hh_q  =  round (f_fht_lh_hh/q3) 

f_fht_hl_ll_q  =  round (f_fht_hl_ll/q2) 
f_fht_hl_lh_q  =  round (f_fht_hl_lh/q3) 
f_fht_hl_hl_q  =  round (f_fht_hl_hl/q2) 
f_fht_hl_hh_q  =  round ( f_fht_hl_hh/q3) 

f_fht_hh_ll_q  =  round (f_fht_hh_ll/q3) 
f_fht_hh_lh_q  =  round (f_fht_hh_lh/q3) 
f_fht_hh_hl_q  =  round (f_fht_hh_hl/q3) 
f_fht_hh_hh_q  =  round (f_fht_hh_hh/q3) 


%  quantize  of  LH  subbands 


%  quantize  of  HL  subbands 


%  quantize  of  HL  subbands 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LL  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL_LL  4x4  (Results  are  one  1584x1  vector  [ll_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_ll_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf "  if  a  group  of  16  is  all 

%  zeros.) 

[last_ll_ll_r,ll_ll_r]  =  raster (f_fht_ll_ll_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_ll_ll_r  =  make_it_compact (ll_ll_r, last_ll_ll_r,4) ; 
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%  parsing  LL_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_ll_ll_r_3D  =  parse_3D(a_ll_ll_r, last_ll_ll_r) ; 
l_p_ll_ll_r_3D(i+l)  =  length (parsed_ll_ll_r_3D ( :,1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_ll_ll_r_3D(i+D ,  ll_ll_r_22 (i+1) ]  =  get_bits2 (parsed_ll_ll_r_3D/ frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL_LH  4x4  (Results  are  one  1584x1  vector  [ll_lh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_ll_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros. ) 

[last_ll_lh_r,ll_lh_r]  =  raster  (f_fht_ll_lh_q,  4)  ,- 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_ll_lh_r  =  make_it_compact (ll_lh_r, last_ll_lh_r, 4) ; 

%  parsing  LL_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_ll_lh_r_3D  =  parse_3D(a_ll_lh_r, last_ll_lh_r) ; 
l_p_ll_lh_r_3D(i+l)  =  length ( par sed_ll_lh_r_3D( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_ll_lh_r_3D(i+D ,  ll_lh_r_22 (i+1) ]  =  get_bits2 (parsed_ll_lh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL_HL  4x4  (Results  are  1584x1  vector  [ll_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_ll_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_ll_hl_r,ll_hl_r]  =  raster (f_fht_ll_hl_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_ll_hl_r  =  make_it_compact  (ll_hl_r,  last_ll_hl_r,  4)  ,- 

%  parsing  LL_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_ll_hl_r_3D  =  parse_3D(a_ll_hl_r,  last_ll_hl_r) ; 
l_p_ll_hl_r_3D(i+l)  =  length (parsed_ll_hl_r_3D(: ,1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_ll_hl_r_3D(i+D ,  ll_hl_r_22 (i+1) ]  =  get_bits2 (parsed_ll_hl_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LL_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LL_HH  4x4  (Results  are  1584x1  vector  [ll_hh_r]  and  the  index  of 

%  the  last  non-zero  entity  [last_ll_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_ll_hh_r#H_hh_r]  =  raster (f_fht_ll_hh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_ll_hh_r  =  make_it_compact (ll_hh_r, last_ll_hh_r, 4) ; 

%  parsing  LL_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_ll_hh_r_3D  =  parse_3D(a_ll_hh_r, last_ll_hh_r) ; 
l_p_ll_hh_r_3D(i+l)  =  length (parsed_ll_hh_r_3D (.-,  1)  )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_ll_hh_r_3D(i+l) ,  ll_hh_r_22 (i+1) ]  =  get_bits2 (parsed_ll_hh_r_3D, frame_type) ; 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  LH  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LL  4x4  (Results  are  one  1584x1  vector  [lh_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_ll_r,lh_ll_r]  =  raster  (f_fht_lh_ll_q,  4)  ,- 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_ll_r  =  make_it_compact  (lh_ll_r,  last_lh_ll_r,  4)  ,- 

%  parsing  LH_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_ll_r_3D  =  parse_3D(a_lh_ll_r, last_lh_ll_r) ; 
l_p_lh_ll_r_3D(i+l)  =  length (parsed_lh_ll_r_3D( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_ll_r_3D(i+U ,  lh_ll_r_22 (i+1) ]  =  get_bits2 (parsed_lh_ll_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_LH  4x4  (Results  are  one  1584x1  vector  [lh_lh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_lh_r,lh_lh_r]  =  raster (f_fht_lh_lh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_lh_r  =  make_it_compact (lh_lh_r, last_lh_lh_r, 4) ; 

%  parsing  LH_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_lh_r_3D  =  parse_3D(a_lh_lh_r,  last_lh_lh_r)  ,- 
l_p_lh_lh_r_3D ( i+1 )  =  length (parsed_lh_lh_r_3D ( : , 1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_lh_r_3D(i+D ,  lh_lh_r_22 (i  +  1) ]  =  get_bits2 (parsed_lh_lh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HL  4x4  (Results  are  1584x1  vector  [lh_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_lh_hl_r,lh_hl_r]  =  raster (f_fht_lh_hl_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_hl_r  =  make_i  t_compact ( lh_hl_r , las  t_lh_hl_r , 4 ) ; 

%  parsing  LH_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hl_r_3D  =  parse_3D(a_lh_hl_r, last_lh_hl_r) ; 
l_p_lh_hl_r_3D(i+l)  =  length (parsed_lh_hl_r_3D( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_hl_r_3D(i+D ,  lh_hl_r_22 (i+1) ]  =  get_bits2 (parsed_lh_hl_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  LH_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  LH_HH  4x4  (Results  are  1584x1  vector  [lh_hh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_lh_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 
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%  zeros.) 

[last_lh_hh_r,lh_hh_r]  =  raster (f_fht_lh_hh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_lh_hh_r  =  make_it_compact (lh_hh_r , last_lh_hh_r, 4) ; 

%  parsing  LH_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_lh_hh_r_3D  =  parse_3D(a_lh_hh_r , last_lh_hh_r) ; 
l_p_lh_hh_r_3D ( i+1 )  =  length (parsed_lh_hh_r_3D ( : , 1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_lh_hh_r_3D(i+D ,  lh_hh_r_22 (i+1) ]  =  get_bits2 (parsed_lh_hh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HL  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LL  4x4  (Results  are  a  1584x1  vector  [hl_ll_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_ll_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_ll_v,hl_ll_v]  =  vertical (f_fht_hl_ll_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_ll_v  =  make_it_compact (hl_ll_v, last_hl_ll_v,  4)  ; 

%  parsing  HL_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_ll_v_3D  =  parse_3D(a_hl_ll_v, last_hl_ll_v)  ; 
l_p_hl_ll_v_3D(i+l)  =  length (parsed_hl_ll_v_3D ( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 

[bits_hl_ll_v_3D(i+l) ,  hl_ll_v_22 (i+1) ]  =  get_bits2 (parsed_hl_ll_v_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_LH  4x4  (Results  are  a  1584x1  vector  [hl_lh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_lh_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_lh_v,hl_lh_v]  =  vertical (f_fht_hl_lh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_lh_v  =  make_it_compact (hl_lh_v, last_hl_lh_v,4) ; 

%  parsing  HL_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_lh_v_3D  =  parse_3D(a_hl_lh_v, last_hl_lh_v)  ; 
l_p_hl_lh_v_3D(i+l)  =  length (parsed_hl_lh_v_3D ( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_lh_v_3D(i+D ,  hl_lh_v_22 (i+1) ]  =  get_bits2 (parsed_hl_lh_v_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hl_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hl_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_hl_v,hl_hl_v]  =  vertical (f_fht_hl_hl_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hl_v  =  make_it_compact (hl_hl_v, last_hl_hl_v, 4) ; 
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%  parsing  HL_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_hl_v_3D  =  parse_3D(a_hl_hl_v, last_hl_hl_v) ; 
l_p_hl_hl_v_3D(i+l)  =  length (parsed_hl_hl_v_3D ( :  ,  1)  )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_hl_v_3D(i+D ,  hl_hl_v_22 (i+1) ]  =  get_bits2 (parsed_hl_hl_v_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HL_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HL_HL  4x4  (Results  are  a  1584x1  vector  [hl_hh_v]  and  the  index  of 
%  the  last  non-zero  entity  [last_hl_hh_v] ,  where  the  v  indicates 

%  vertical  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[last_hl_hh_v,hl_hh_v]  =  vertical  (f_fht_hl_hh_q,  4)  ,- 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hl_hh_v  =  make_it_compact (hl_hh_v, last_hl_hh_v, 4)  ; 

%  parsing  HL_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hl_hh_v_3D  =  parse_3D(a_hl_hh_v, last_hl_hh_v) ; 
l_p_hl_hh_v_3D ( i+1 )  =  length (parsed_hl_hh_v_3D ( : , 1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hl_hh_v_3D(i  +  D  ,  hl_hh_v_22  (i  +  1)  ]  =  get_bits2  (parsed_hl_hh_v_3D,  frame_type)  ,- 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Working  on  HH  subbands  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_LL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_LL  4x4  (Results  are  one  1584x1  vector  [hh_ll_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_ll_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf  if  a  group  of  16  is  all 

%  zeros.) 

[last_hh_ll_r,hh_ll_r]  =  raster (f_fht_hh_ll_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_ll_r  =  make_it_compact (hh_ll_r, last_hh_ll_r,  4)  ; 

%  parsing  HH_LL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_ll_r_3D  =  parse_3D(a_hh_ll_r, last_hh_ll_r) ; 
l_P_hh_ll_r_3D(i  +  l)  =  length (parsed_hh_ll_r_3D ( : ,  1)  )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hh_ll_r_3D(i+l) ,  hh_ll_r_22 (i+1) ]  =  get_bits2 (parsed_hh_ll_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_LH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_LH  4x4  (Results  are  one  1584x1  vector  [hh_lh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_lh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf  if  a  group  of  16  is  all 

%  zeros.) 

[last_hh_lh_r,hh_lh_r]  =  raster (f_fht_hh_lh_q, 4) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_lh_r  =  make_it_compact (hh_lh_r, last_hh_lh_r, 4) ; 

%  parsing  HH_LH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_lh_r_3D  =  parse_3D(a_hh_lh_r,  last_hh_lh_r)  ,- 
l_P_hh_lh_r_3D(i  +  l)  =  length (parsed_hh_lh_r_3D( : ,  1)  )  ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 
%  (includes  99  COD  bits) 
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[bits_hh_lh_r_3D(i+D ,  hh_lh_r_22 (i+1) ]  =  get_bits2 (parsed_hh_lh_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_HL  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_HL  4x4  (Results  are  1584x1  vector  [hh_hl_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_hl_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf "  if  a  group  of  16  is  all 

%  zeros.) 

[ last_hh_hl_r , hh_hl_r ]  =  raster ( f_f ht_hh_hl_q, 4 ) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_hl_r  =  make_it_compact (hh_hl_r, last_hh_hl_r, 4) ; 

%  parsing  HH_HL  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_hl_r_3D  =  parse_3D(a_hh_hl_r, last_hh_hl_r) ; 
l_p_hh_hl_r_3D(i+l)  =  length ( par sed_hh_hl_r_3D ( : , 1) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hh_hl_r_3D(i+D ,  hh_hl_r_22 (i+1) ]  =  get_bits2 (parsed_hh_hl_r_3D, frame_type) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%  Doing  the  HH_HH  subband  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%  scans  each  HH_HH  4x4  (Results  are  1584x1  vector  [hh_hh_r]  and  the  index  of 
%  the  last  non-zero  entity  [last_hh_hh_r] ,  where  the  r  indicates 

%  horizontal  raster  scan  method.   Get  "inf"  if  a  group  of  16  is  all 

%  zeros.) 

[ last_hh_hh_r , hh_hh_r ]  =  raster ( f  _f  ht_hh_hh_q, 4 ) ; 

%  gets  rid  of  trailing  zeros  (one  big  vector  of  varying  size) 
a_hh_hh_r  =  make_it_compact (hh_hh_r , last_hh_hh_r , 4 ) ; 

%  parsing  HH_HH  and  getting  the  number  of  parsings  for  later  used  in  eval_thesis  to 
%  get  the  percent  of  time  that  the  default  bit  number  is  used. 

parsed_hh_hh_r_3D  =  parse_3D(a_hh_hh_r, last_hh_hh_r) ; 
l_P_hh_hh_r_3D ( i+1 )  =  length (parsed_hh_hh_r_3  D ( : , 1 ) ) ; 

%  gets  bits  per  frame  and  number  of  times  the  default  is  chosen 

%  (includes  99  COD  bits) 

[bits_hh_hh_r_3D(i+l) ,  hh_hh_r_2 2 ( i  + 1 ) ]  =  get_bits2 (parsed_hh_hh_r_3D, frame_type)  ; 

%  getting  bits  per  frame  per  layer  and  total 

layerl(i+l)  =  bits_ll_ll_r_3D(i+l)  +  bits_ll_lh_r_3D(i+l)  +  ... 

bits_lh_ll_r_3D(i+l)+  bits_lh_lh_r_3D(i+l)  +  ... 

bits_ll_hl_r_3D(i+l)+  bits_hl_ll_v_3D(i+l)  +  ... 

bits_hh_ll_r_3D(i+l)+  bits_hl_hl_v_3D(i+l)  +  ... 

bits_hh_hh_r_3D(i+l) ; 
layer2(i+l)  =  bits_ll_hh_r_3D(i+l)  +  bits_lh_hl_r_3D(i+l)  +  ... 

bits_lh_hh_r_3D(i+l)  +  bits_hl_lh_v_3D(i+l)  +  ... 

bit  s_hl_hh_v_3  D ( i + 1 ) ; 
layer3(i+l)  =  bits_hh_lh_r_3D(i+l)  +  bits_hh_hl_r_3D(i+l) ; 
total (i+1)  =  layerl(i+l)  +  layer2(i+l)  +  layer3 (i+1) ; 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%  % 

%  Channel  % 

%  % 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Unquantize  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_fht_ll_ll_uq  =  f_fht_ll_ll_q  *  ql;  %  unquantize  of  LH  subbands 

f_fht_ll_lh_uq  =  f_fht_ll_lh_q  *  q2; 

f_fht_ll_hl_uq  =  f_fht_ll_hl_q  *  q2 ; 

f_fht_ll_hh_uq  =  f_fht_ll_hh_q  *  q3 ; 


f_fht_lh_ll_uq  =  f_fht_lh_ll_q  *  q2 
f_fht_lh_lh_uq  =  f_fht_lh_lh_q  *  q2 
f_fht_lh_hl_uq  =  f_fht_lh_hl_q  *  q3 


%  unquantize  of  LH  subbands 
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f_fht_lh_hh_uq  =  f_fht_lh_hh_q  *  q3 ; 

f_fht_hl_ll_uq  =  f_fht_hl_ll_q  *  q2 

f_fht_hl_lh_uq  =  f_fht_hl_lh_q  *  q3 

f_fht_hl_hl_uq  =  f_fht_hl_hl_q  *  q2 

f_fht_hl_hh_uq  =  f_fht_hl_hh_q  *  q3 


f_fht_hh_ll_uq  = 
f_fht_hh_lh_uq  = 
f_fht_hh_hl_uq  = 
f_fht_hh_hh_uq  = 


f_fht_hh_ll_q 
f_fht_hh_lh_q 
f_fht_hh_hl_q 
f_fht_hh_hh_q 


q3 
q3 
q3 

q3 


%  unquantize  of  HL  subbands 


%  unquantize  of  HH  subbands 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Inverse  Transform  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

f_far_ll  =  remake_3(f_far_ll,f_fht_ll_ll_uq,f_fht_ll_lh_uq,f_fht_ll_hl_uq,  ... 

f_fht_ll_hh_uq,m_vec_ndx, 8) ;  %  LL  subband  inv  FHT 

f_far_lh  =  remake_3 (f_far_lh,f_fht_lh_ll_uq, f_fht_lh_lh_uq,f_fht_lh_hl_uq,  ... 

f_fht_lh_hh_uq,m_vec_ndx,  8)  ,-  %  LH  subband  inv  FHT 

f_far_hl  =  remake_3 (f_far_hl, f_fht_hl_ll_uq, f_fht_hl_lh_uq,  f_fht_hl_hl_uq,  ... 

f_fht_hl_hh_uq,m_vec_ndx, 8) ;  %  HL  subband  inv  FHT 

f_far_hh  =  remake_3 (f_far_hh, f_fht_hh_ll_uq, f_fht_hh_lh_uq,  f_fht_hh_hl_uq,  ... 

f_fht_hh_hh_uq,m_vec_ndx, 8) ;  %  HH  subband  inv  FHT 

f_f ar_ll_p2  =  remake_3 ( f_f ar_ll_p2 , f_f ht_ll_ll_uq, f_f ht_ll_lh_uq, . . . 

f_fht_ll_hl_uq, f_fht_ll_hh_uq,m_vec_ndx, 8) ;  %  LL  p2  subband  inv  FHT 
f_f ar_lh_p2  =  remake_3 ( f_f ar_lh_p2 , f_fht_lh_ll_uq, f_f ht_lh_lh_uq, . . . 

f_fht_lh_hl_uq/f_fht_lh_hh_uq,m_vec_ndx, 8) ;  %  LH  p2  subband  inv  FHT 
f_far_hl_p2  =  remake_3 (f_far_hl_p2, f_fht_hl_ll_uq, f_fht_hl_lh_uq, . . . 

f_fht_hl_hl_uq,f_fht_hl_hh_uq,m_vec_ndx, 8) ;  %  HL  p2  subband  inv  FHT 
f_far_hh_p2  =  remake_3 ( f _f ar_hh_p2 , f_fht_hh_ll_uq, 0, 0, . . . 

f_fht_hh_hh_uq,m_vec_ndx, 8) ;  %  HH  p2  subband  inv  FHT 

f_far_ll_pl  =  remake_3 ( f_far_ll_pl, f_fht_ll_ll_uq, f_fht_ll_lh_uq, . . . 

f_fht_ll_hl_uq, 0,m_vec_ndx, 8) ;  %  LL  pi  subband  inv  FHT 

f_far_lh_pl  =  remake_3 ( f _f ar_lh_pl , f_fht_lh_ll_uq, f_fht_lh_lh_uq, . . . 

0, 0,m_vec_ndx, 8) ;  %  LH  pi  subband  inv  FHT 

f_far_hl_pl  =  remake_3 ( f _f ar_hl_pl , f_fht_hl_ll_uq, 0, . . . 

f_fht_hl_hl_uq,0,m_vec_ndx, 8) ;  %  HL  pi  subband  inv  FHT 

f_f ar_hh_pl  =  remake_3 ( f_f ar_hh_pl , f_fht_hh_ll_uq, 0 , . . . 

0, f_fht_hh_hh_uq,m_vec_ndx,8) ;  %  HH  pi  subband  inv  FHT 


f_3  =  remake_3 (f_3, f_far_ll, f_far_lh, f_far_hl, . . . 

f_far_hh,m_vec_ndx, 16) ;  %  frame  inv  FHT  with  3  layers 

f_2  =  remake_3 ( f _2 , f_f ar_ll_p2 , f_f ar_lh_p2 , f_f ar_hl_p2 

f_far_hh_p2,m_vec_ndx, 16) ;  %  frame  inv  FHT  with  2  layers 

f _1  =  remake_3 ( f_l , f_f ar_ll_pl , f _f ar_lh_pl , f_f ar_hl_pl , . . . 

f_far_hh_pl,m_vec_ndx, 16) ;  %  frame  inv  FHT  with  1  layer 

end  %  ends  frame_type  descrimination 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  Make  Display  Size  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


fig_3  =  shape_back ( f _3 ) ; 
fig_2  =  shape_back ( f _2 ) ; 
fig_l  =  shape_back(f_l) ; 

L3  =  round  ( f  ig_3  )  ,- 
L2  =  round (fig_2) ; 
LI  =  round (fig_l) ; 


%  make  display  dimensions 

%  make  display  dimensions 

%  make  display  dimensions 

%  for  viewing 

%  for  viewing 

%  for  viewing 


%%%%%%%%%%%%%%%%%%%%%%%  Calculates  the  mean-square-error  per  frame  %%%%%%%%%%%%%%%%% 
if  mse 

MSE_3L(i+l,l)  =  (sum(sum( (i_present  -  f ig_3 ) . ~2 ) ) ) /176/144 

MSE_2L(i+l,l)  =  (sum(sum( (i_present  -  f ig_2 ) . ~2 ) ) ) /176/144 

MSE_lL(i+l,l)  =  (sum(sum( (i_present  -  f ig_l ) . ~2 ) ) ) /176/144 
end  %  end  if  mse 
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if  display 

%  The  following  lines  show  the  figure  with  3  layers 

subplot (2,2,2) 

image ( L3 ) 

axis  off 

title ( 'Layers  1,  2,  and  3') 

%  The  following  lines  show  the  figure  with  2  layers 

subplot  (2, 2, 3) 

image ( L2 ) 

title ( 'Layers  1  and  2') 

axis  off 

%  The  following  lines  show  the  figure  with  1  layer 

subplot(2,2,4) 

image (LI) 

title ('Layer  1  Only') 

axis  off 

drawnow 

end  %  ends  if  display 

%  closing  reconstruction  occasionally  to  conserve  memory 

if  ( (i==20)  |  (i==40)  | (i==60) | (i==80)  |  (i  ==100)  |  (i  =  =  120)  | (i==140) | (i==160) | (i==180) 
(i==200) | (i==220) | (i==240) | (i==260) | (i==280) ) 
close  all 
end 
end  %  ends  looping  through  frames 

%  The  following  lines  simply  rename  variables  to  be  consistent  with  an  off-line 
%  evaluation  program  used  in  the  development. 

%  LL  dynamic 

BITS_LL_HUFF  =  bits_ll_zz ' ; 


%  LL  static 

BITS_LL_LL_3D  =  bits_ll_ll_r_3D' 
SCAN_LL_LL_22  =  ll_ll_r_22 ' ; 
SCAN_LL_LL_LEN  =  l_p_ll_ll_r_3D' 

BITS_LL_LH_3D  =  bits_ll_lh_r_3D' 
SCAN_LL_LH_22  =  ll_lh_r_22 ' ; 
SCAN_LL_LH_LEN  =  l_p_ll_lh_r_3D' 

BITS_LL_HL_3D  =  bits_ll_hl_r_3D' 
SCAN_LL_HL_22  =  ll_hl_r_22'; 
SCAN_LL_HL_LEN  =  l_p_ll_hl_r_3D' 

BITS_LL_HH_3D  =  bits_ll_hh_r_3D' 
SCAN_LL_HH_22  =  ll_hh_r_22 ' ; 
SCAN_LL_HH_LEN  =  l_p_ll_hh_r_3D' 


%  LH  both  types  of  slide 
BITS_LH_LL_3D  =  bits_lh_ll_r_3D' 
SCAN_LH_LL_22  =  lh_ll_r_22 ' ; 
SCAN_LH_LL_LEN  =  l_p_lh_ll_r_3D' 

BITS_LH_LH_3D  =  bits_lh_lh_r_3D' 
SCAN_LH_LH_22  =  lh_lh_r_22  '  ,- 
SCAN_LH_LH_LEN  =  l_p_lh_lh_r_3D' 

BITS_LH_HL_3D  =  bits_lh_hl_r_3D' 
SCAN_LH_HL_22  =  lh_hl_r_22 ' ; 
SCAN_LH_HL_LEN  =  l_p_lh_hl_r_3D' 

BITS_LH_HH_3D  =  bits_lh_hh_r_3D' 
SCAN_LH_HH_22  =  lh_hh_r_22 ' ; 
SCAN_LH_HH_LEN  =  l_p_lh_hh_r_3D' 


%  HL  both  types  of  slides 


' 


BITS_HL_LL_3D  =  bits_hl_ll_v_3D' 
SCAN_HL_LL_22  =  hl_ll_v_22 ' ; 
SCAN_HL_LL_LEN  =  l_p_hl_ll_v_3D' 


BITS_HL_LH_3D  = 
SCAN_HL_LH_22  = 
SCAN  HL  LH_LEN 


bits_hl_lh_v_3D' 
hl_lh_v_22' ; 
=  ljp_hl_lh_v_3D' 


BITS_HL_HL_3D  =  bits_hl_hl_v_3D' 
SCAN_HL_HL_22  =  hl_hl_v_22'; 
SCAN_HL_HL_LEN  =  l_p_hl_hl_v_3D' 

BITS_HL_HH_3D  =  bits_hl_hh_v_3D' 
SCAN_HL_HH_22  =  hl_hh_v_22 ' ; 
SCAN_HL_HH_LEN  =  l_p_hl_hh_v_3D' 


%  HH  from  dynamic 
BITS_HH_3D  =  bits_hh_r_3D' 
SCAN_HH_22  =  hh_r_22'; 
SCAN_HH_LEN  =  l_p_hh_r_3D' 


%  HH  from  static 

BITS_HH_LL_3D  =  bits_hh_ll_r_3D' 
SCAN_HH_LL_22  =  hh_ll_r_22'; 
SCAN_HH_LL_LEN  =  l_p_hh_ll_r_3D' 

BITS_HH_LH_3D  =  bits_hh_lh_r_3D' 
SCAN_HH_LH_22  =  hh_lh_r_22 ' ; 
SCAN_HH_LH_LEN  =  l_p_hh_lh_r_3D' 

BITS_HH_HL_3D  =  bits_hh_hl_r_3D' 
SCAN_HH_HL_22  =  hh_hl_r_22 ' ; 
SCAN_HH_HL_LEN  =  l_p_hh_hl_r_3D' 

BITS_HH_HH_3D  =  bits_hh_hh_r_3D' 
SCAN_HH_HH_22  =  hh_hh_r_22 ' ; 
SCAN_HH_HH_LEN  =  l_p_hh_hh_r_3 D ' 


%  saves  parameters  for  evaluation  later 
if  write 

s  =  char ( 'g:\thesis\BR' ) ; 

s  =  strcat (s,num2str (br) ) ; 

save ( s ,  ' m_mat_ndx ' , ' BITS_* ' , ' MSE_* ' , 
end  %  if  write 


SCAN 


layer* ' ) 


function  [output]  =  dct_of_fht (input ,m_vec_ndx) 

%  performs  the  2D  DCT  of  input.   m_vec_ndx  identifies  where  this  operation  needs 
%  to  be  performed. 

output  =  zeros (792, 8) ; 
offset  =  -8; 

for  ndx  =1:99 

offset  =  offset  +  8; 

if  m_vec_ndx ( ndx ) 

output(offset+l:offset+8, : )  =  dct2 ( input (of fset+1 :offset+8,  :  )  )  ; 
end 

end 
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function  [f_fht_ll, f_fht_lh, f_fht_hl, f_fht_hh]  =  fht (f_vec,m_vec_ndx, in) 

%  Performs  the  FHT  of  the  appropriate  IN  x  IN  macroblocks  of  F_VEC  as 
%  specified  by  M_VEC_NDX.  Returns  four  sets  of  99  IN/2  x  IN/2  matrices, 
%  each  stacked  into  one  big  99*IN/2  x  IN/2  matrix.  The  places 
%  where  the  fht  was  not  performed  are  filled  with  zeroes  as  place-holders. 

half  =  in/2; 

f_fht_ll  =  zeros (99*half, half) 
f_fht_lh  =  zeros (99*half, half) 
f_fht_hl  =  zeros (99*half, half) 
f_fht_hh  =  zeros (99*half, half) 
mask_lh  =  [1  1;-1  -1] 
mask_hl  =  [1  -1;1  -1] 
mask_hh  =  [1  -1;-1  1] 

offset  =  -in; 

for  ndx  =1:99 

offset=  offset+in; 
oso2  =  offset/2; 

if  m_vec_ndx ( ndx ) 

for  rndx  =  l:half 

for  cndx  =  l:half 

f_fht_ll (rndx+oso2, cndx)  =  ... 

sum(sum(f_vec(rndx*2-l+of f set :rndx*2+of fset,cndx*2-l:cndx*2) ) ) /4; 
f_fht_lh(rndx+oso2,cndx)  =  sum(sum(mask_lh  .*  ... 

f_vec(rndx*2-l+of f set :rndx*2+of f set, cndx*2-l:cndx*2) )  )  /4; 
f_fht_hl (rndx+oso2 , cndx)  =  sum(sum(mask_hl  .*  ... 

f_vec(rndx*2-l+offset:rndx*2+offset,cndx*2-l:cndx*2) ) ) /4; 
f_fht_hh(rndx+oso2 , cndx)  =  sum(sum(mask_hh  .*  ... 

f_vec (rndx*2-l+of f set :rndx*2+of fset, cndx*2-l:cndx*2) ) ) /4; 

end 
end 
end 
end 


function  [bits]  =  get_bits_Huff (parsed) 

%  Uses  Huffman  table  to  get  bits  for  JPEG-based  compression  of  PARSED. 

global  HUFF_TABLE 

bits  =  0; 

[r,c]  =  size (parsed) ; 

for  ndx  =  l:r 

while (parsed(ndx, 1)  >=  15) 

bits  =  bits  +  11; 

parsed (ndx, 1)  =  parsed (ndx, 1)  -  15; 
end 

table_row  =  f ind( (HUFF_TABLE( : , 2)  ==  parsed (ndx, 1) )  &  ... 

(HUFF_TABLE( :,3)  ==  parsed (ndx, 2 ))) ; 
bits  =  bits  +  HUFF_TABLE(table_row, 4)  +  parsed (ndx, 2 ) ; 
end 
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bits  =  bits  +  99*4; 


function  [bits, count]  =  get_bits2 (parsed, kind) 

%  Uses  PARSED  to  fetch  bits  from  custom  VLC  tables 
%  KIND  will  be  1  for  static  scenes 
%  KIND  will  be  0  for  dynamic  scenec 

global  VLC_DYN  VLC_STA 

bits  =  0; 

count  =  0 ; 

parsed  =  abs (parsed) ; 

[r,c]  =  size (parsed) ; 

if  kind 

for  ndx  =  l:r 

poss_rows  =  find(VLC_STA(: ,2)  ==  parsed (ndx, 1) ) ; 
start  =  poss_rows (1) ; 

poss_rows  =  find(VLC_STA( (poss_rows (1) :poss_rows ( length (poss_rows) ) ) , 3)  ==. 
parsed (ndx, 2) ) ; 

if  ~(isempty(poss_rows) ) 

poss_rows  =  poss_rows  +  start  -  1; 

start  =  poss_rows (1) ; 
end 

if  - (isempty(poss_rows) ) 

poss_rows  =  f ind(VLC_STA( (poss_rows (1) :poss_rows (length (poss_rows) )), 4) 
parsed(ndx,3) ) ; 

poss_rows  =  poss_rows  +  start  -  1; 
end 

if  -(isempty(poss_rows) ) 

bits  =  bits  +  VLC_STA(poss_rows, 5) ; 
else 

bits  =  bits  +  22; 

count  =  count  +  1; 
end 

end 

else 

for  ndx  =  l:r 

poss_rows  =  find(VLC_DYN( : , 2)  ==  parsed(ndx, 1) ) ; 

start  =  poss_rows (1) ; 

poss_rows  =  find(VLC_DYN( (poss_rows (1) :poss_rows ( length (poss_rows) ) ) , 3)  ==. 

parsed (ndx, 2) ) ; 

if  - (isempty (poss_rows) ) 

poss_rows  =  poss_rows  +  start  -  1; 

start  =  poss_rows ( 1 ) ; 
end 

if  - (isempty (poss_rows) ) 

poss_rows  =  f ind(VLC_DYN( (poss_rows (1) :poss_rows ( length (poss_rows) ) ) ,4) 

parsed (ndx, 3)  )  ; 

poss_rows  =  poss_rows  +  start  -  1; 
end 

if  -( isempty (poss_rows ) ) 

bits  =  bits  +  VLC_DYN(poss_rows, 5) ; 
else 

bits  =  bits  +  22; 

count  =  count  +  1; 
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end 

end 

end 

bits  =  bits  +  99; 


function  [f]  =  get_next_image (num) 

%  Gets  next  frame  as  a  .bmp  file  converts  to  form  needed  in  MATLAB 

s  =  char ( 'g:\pictures\Head' ) ; 
fn  =  strcat (s,num2str (num) ) ; 
f  =  imread(fn, 'bmp' ) ; 
f  =  double (f (:, :,!)); 


function  [f]  =  get_next_image_2 (num) 

%  Gets  next  frame  as  a  .bmp  file  converts  to  form  needed  in  MATLAB 

s  =  char ( 'g:\pictures\ncaa' ) ; 
fn  =  strcat (s,num2str (num) ) ; 
f  =  imr ead ( f n , ' bmp ' ) ; 
f  =  double (f (:, :,!)); 


function  [Q]  =  get_Ql_matrix(ql) 

%  makes  the  JPEG  standard  quantization  matrix  and  multiplies  it  by  ql. 

%  ql  =  16  will  result  in  no  scaling  of  the  matrix  when  coupled  with  the  rest 

%  of  the  code,   ql  <  16  is  finer  quantization,  i.e.  less  quantization  noise  results. 

Q  =  [16  11  10  16  24  40  51  61; 

12  12  14  19  26  58  60  55; 

14  13  16  24  40  57  69  56; 

14  17  22  29  51  87  80  62; 

18  22  37  56  68  109  103  77; 

24  35  55  64  81  104  113  92; 

49  64  78  87  103  121  120  101; 

72  92  95  98  112  100  103  99]  .*  ql; 


function  [out]  =  get_qd_entryf (flag, default , entry, del ta_Binter,MBnum, beta) 

%  selects  the  appropriate  quantizer  triplet  based  on  the  input  parameters 

%  FLAG  implies  the  first  frame  following  a  scene  change  frame.   DEFAULT  is  the 

%  choice  of  triplet  based  on  test  sequences  and  serves  as  a  starting  point  for  a 

%  new  sequence.   ENTRY  hold  the  table  entry  from  the  previous  frame.   The  remaining 

%  parameters  ara  as  defined  in  the  thesis. 

global  QD_TABLE 

if  (flag) 
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out  =  mint  find (default  >  QD_TABLE ( : , 4 ) ) ) 
else 

deltaQ  =  fix(delta_Binter/MBnum/beta) ; 
out  =  entry  +  deltaQ; 
if  (out  >  17) 

out  =  17; 
end 
if  (out  <  1) 

out  =  1; 
end 
end 


function  [run]  =  get_run(seq_to_code, len) 

%  Called  by  the  parsing  functions  PARSE_3D  and  PARSE_HUFF,  this  function  obtains 
%  the  RUN  field  for  RLE. 

run  =  zeros (len, 1) ; 

count  =  0 ; 
place  =  1; 
mdx  =  1 ; 

while  (mdx  <=  length ( seq_to_code ) ) 
if  seq_to_code (mdx) 
place  =  place  +  1; 
mdx  =  mdx  +  1; 
else 

while  (seq_to_code(mdx)  ==  0) 
mdx  =  mdx  +  1; 
count  =  count  +  1; 
end 
run (place)  =  count; 
count  =  0 ; 
place  =  place  +  1; 
mdx  =  mdx  +  1; 
end 
end 


function  [output]  =  invdct_of_fht (input, m_vec_ndx) 

%  Performs  the  inverse  2D  DCT.   M_VEC_NDX  identifies  where  this  operation  needs  to  be 
%  performed. 

output  =  zeros (792, 8) ; 
offset  =  -8; 

for  ndx  =  1:99 

offset  =  offset  +  8; 

if  m_vec_ndx(ndx) 

output ( off set+1: off set+8, : )  =  idct2 (input (of fset+l:of fset+8, : ) ) ; 
end 

end 
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function  [m_vec_ndx,byAge,b_count,mb_count]  =  m_blk_id_xr (f 1, f2,T) 

%  Figures  out  which  16x16  macroblocks  of  the  144x176  image  need  to  be  further 

%  processed  via  an  absolute  sum  of  differences  between  the  last  frame  fl  and 

%  the  current  frame  f2.  m_vec_ndx  is  99x1  vector  of  0  or  1  raster  scanned. 

%  T  is  the  threshold  utilized.   byAge  is  1  if  macroblocks  are  selected  only  due 

%  to  aging.   byAge  is  0  otherwise.   b_count  and  mb_count  were  used  in  refining  the 

%  search  sequence.   b_count  is  "block  count."   mb_count  is  "macroblock  count."   As 

%  are  artifacts  now,  they  are  commented-out . 

global  LAST  RV 

%b_count  =  0; 
%mb_count  =  0; 
r_count  =  0 ; 
m_vec_ndx  =  zeros (99,1) ; 

for  ndx  =1:99 

if  (RV(ndx)  ==  0) 

m_vec_ndx(ndx)  =  1; 
RV(ndx)  =  floor (21*rand) ; 
else 

RV(ndx)  =  RV(ndx)  -  1; 
end 
end 

byAgel  =  sum(m_vec_ndx) ; 

for  rndx  =  1:16:1584 

r_count  =  r_count  +  1; 

i  f  ( -m_vec_ndx ( r_count ) ) 
go  =  1; 

if  ( (LAST(r_count)  ==  1)  &  (go)) 
for  cndx  =  1 

fl_8x8  =  fl (rndx: rndx+7, cndx: cndx+7) ; 
f2_8x8  =  f 2 (rndx: rndx+7, cndx: cndx+7) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  1; 

LAST ( r_count )  =  1; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  f 1 (rndx+8:rndx+ 15, cndx+8 :cndx+15) ; 
f2_8x8  =  f 2 (rndx+8:rndx+15, cndx+8 :cndx+15) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  2 ,- 

LAST(r_count)  =  3; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl (rndx: rndx+7, cndx+8 :cndx+15) ; 
f2_8x8  =  f2 (rndx:rndx+7, cndx+8 :cndx+15) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID's  position 
%mb_count  =  mb_count  +  1  ,- 
%b_count  =  b_count  +  3 ; 
LAST(r_count)  =  2; 
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go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  f 1 (rndx+8:rndx+15,cndx:cndx+7) ; 
f2_8x8  =  f2 (rndx+8:rndx+15,cndx:cndx+7)  ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1  ,- 

%b_count  =  b_count  +  4 ; 

LAST(r_count)  =  4; 

go  =  0; 
end 

end  %for  cndx 
end  %  if  LAST 

if  ( (LAST(r_count)  ==  2)  &  (go)) 
for  cndx  =  1 

fl_8x8  =  fl(rndx:rndx+7,cndx+8:cndx+15) ; 
f2_8x8  =  f2 (rndx:rndx+7,cndx+8 :cndx+15) ; 
asd  =  abs (sum(sum(f l_8x8-f 2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  1  ,- 

LAST(r_count)  =  2 ; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl (rndx+8:rndx+15,cndx:cndx+7) ; 
f2_8x8  =  f2  (rndx+8:rndx+15,cndx:cndx+7)  ,- 
asd  =  abs (sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx(r_count)  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  2 ; 

LAST(r_count)  =  4; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl(rndx+8:rndx+15,cndx+8:cndx+15) ; 
f2_8x8  =  f2 (rndx+8:rndx+15,cndx+8:cndx+15) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx(r_count)  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1  ,- 

%b_count  =  b_count  +  3 ; 

LAST ( r_count )  =  3; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  f 1 (rndx:rndx+7,cndx:cndx+7) ; 
f2_8x8  =  f2 (rndx:rndx+7, cndx:cndx+7) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx(r_count)  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1 ; 

%b_count  =  b_count  +  4; 

LAST(r_count)  =  1; 

go  =  0; 
end 
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end  %for  cndx 
end  %  if  LAST 

if  ( (LAST(r_count)  ==  3)  &  (go)) 
for  cndx  =  1 

fl_8x8  =  fl(rndx+8:rndx+15,cndx+8:cndx+15) ; 
f2_8x8  =  f2 (rndx+8:rndx+15,cndx+8:cndx+15) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx(r_count)  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  1 ; 

LAST ( r_count )  =  3 ; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl (rndx:rndx+7,cndx:cndx+7) ; 
f2_8x8  =  f2 (rndx:rndx+7,cndx:cndx+7) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  2 ; 

LAST(r_count)  =  1; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl (rndx+8:rndx+15,cndx:cndx+7) ; 
f2_8x8  =  f2 (rndx+8:rndx+15, cndx:cndx+7) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1 ; 

%b_count  =  b_count  +  3 ; 

LAST(r_count)  =  4; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl(rndx:rndx+7,cndx+8:cndx+15) ; 
f2_8x8  =  f2 (rndx:rndx+7,cndx+8:cndx+15) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 

%mb_count  =  mb_count  +  1 ; 

%b_count  =  b_count  +  4; 

LAST(r_count)  =  2; 

go  =  0; 
end 

end  %for  cndx 
end  %  if  LAST 

if  ( (LAST(r_count)  ==  4)  &  (go)) 
for  cndx  =  1 

fl_8x8  =  f 1 (rndx+8:rndx+15, cndx:cndx+7)  ; 
f2_8x8  =  f2 (rndx+8:rndx+15,cndx:cndx+7) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID'S  position 
%mb_count  =  mb_count  +  1  ,- 
%b_count  =  b_count  +  1; 
LAST ( r_count )  =  4; 
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go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  fl (rndx:rndx+7, cndx+8 :cndx+15)  ; 
f2_8x8  =  f2  (rndx:mdx+7,cndx+8:cndx+15)  ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) )  ; 

if  (asd  >  T) 

m_vec_ndx(r_count)  =  1;  %  ID's  position 

%mb_count  =  mb_count  +  1 ; 

%b_count  =  b_count  +  2 ; 

LAST(r_count)  =  2 ; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  f 1 (rndx:rndx+7, cndx:cndx+7) ; 
f2_8x8  =  f2 (rndx:rndx+7, cndx:cndx+7)  ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID's  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  3 ; 

LAST ( r_count )  =  1; 

go  =  0; 

break;  %  gets  out  of  the  inner  loop  if  justified  now 
end 

fl_8x8  =  f l(rndx+8:rndx+15, cndx+8: cndx+15 ) ; 
f2_8x8  =  f 2 (rndx+8:rndx+15, cndx+8: cndx+15) ; 
asd  =  abs(sum(sum(fl_8x8-f2_8x8) ) ) ; 

if  (asd  >  T) 

m_vec_ndx ( r_count )  =  1;  %  ID's  position 

%mb_count  =  mb_count  +  1; 

%b_count  =  b_count  +  4 ; 

LAST(r_count)  =  3; 

go  =  0; 
end 


end  %for  cndx 
end  %  if  LAST 

end  %if  -m_vec_ndx 

end  %  for  rndx 

byAge2  =  sum(m_vec_ndx) ; 
byAge  =  (byAgel  ==  byAge2); 


function  [table]  =  make_HAC_table ( ) ; 

%  Generates  the  Huffman  VLC  table 

vecl  =  ones (10, 1) ; 
vec2  =  (1:10) '; 

index  =  (1:162) '; 

rutin  =  [0;vecl*0;vecl;vecl*2;vecl*3  ;vecl*4,-vecl*5;vecl*6; 

vecl*7;vecl*8;vecl*9;vecl*10;vecl*ll;vecl*12;vecl*13; 

vecl*14;vecl*15;15] ; 
siz  =  [0;vec2;vec2;vec2;vec2;vec2;vec2 ;vec2 ;vec2;vec2; 

vec2  ;  vec2  ;  vec2  ;  vec2  ;  vec2  ;  vec2  ;  0 ;  vec2  ]  ; 
cw_len  =  [ 0 ; 2 ; 2 ; 3 ; 4 ; 5 ; 7 ; 8 ; 10 ; 16 ; 16 ; 4 ; 5 ; 7 ; 9 ; 11 ; 16 ; 16 ; 16 ; 16 ; 


95 


1 6 ; 5 ; 8 ; 10 ; 12 ; ones ( 6 , 1 ) * 16 ; 6 ; 9 ; 12 ; ones ( 7 , 1 ) *  16 ; 6 ; 10 ; 
one  s  ( 8 , 1 )  * 1 6 ;  8 ;  1 1 ; one s ( 8 , 1 ) * 1 6 ; 7 ; 1 2 ; o ne  s ( 8 , 1 ) * 1 6 ; 8 ; 
1 2  ;  ones  ( 8 , 1 )  *  1 6  ;  9  ;  ones  ( 9 , 1 )  *  1 6  ;  9  ;  ones  ( 9  , 1 )  *  1 6 ,-  9  ; 
ones (9 , 1) *16; 10 ; ones ( 9 , 1) *16 ; 10 ; ones (9 , 1) *16; 11; 
ones(19,l)*16;ll;ones(10,l)*16] ; 
table  =  [index,  runn,  siz,  cw_len] ; 


function  [a]  =  make_it_compact (big, index, size) 

%  gets  rid  of  trailing  zeros  in  the  vector  BIG  that  resulted  from  scanning  99  matrices 
%  of  dimensions  SIZExSIZE.  INDEX  holds  the  position  of  the  last  non-zero  entry  in  each  of 
%  the  99  matrices. 

entry  =  size~2; 
offset  =  -entry; 
a  =  []; 
for  ndx  =1:99 

offset  =  entry  +  offset; 

if  isinf (index (ndx) ) 
a  =  [a  ;  inf ] ; 

else 

a  =  [a  ;  [big(of f set+l:of f set+index(ndx)  )  ]  ]  ,- 

end 
end 


function  [parsed]  =  parse_3D (vector, index) 

%  RLE's  VECTOR  into  the  {last , run, level}  format.   INDEX  is  the  position  of  the  last  non 
%  zero  entity  in  each  of  the  99  matrices. 

last  =  [  ]  ; 
level  =  [] ; 
run  =  [ ] ; 

for  ndx  =1:99 

if  isinf (index (ndx) ) 

index (ndx)  =  1; 
end 

point  =  sum ( index (l:ndx) ) ; 

seq_to_code  =  vector (point-index (ndx) +1 : point ) ; 
len  =  length ( seq_to_code ) ; 
if  (den  ==  1)  &  isinf  (seq_to_code)  ) 

last  =  [last;0] ; 

level  =  [ level ;0]; 

run  =  [ run ; 0 ] ; 
else 

dummy  =  seq_to_code(f ind(seq_to_code) ) ; 

level  =  [  1  eve  1;  dummy ]  ; 

last  =  [last;zeros(length(dummy) -1, 1)  ;  1] ; 

this_run  =  get_run(seq_to_code, length (dummy) ) ; 

run  =  [run; this_run] ; 
end 
end 

parsed  =  [ last, run, level] ; 
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function  [parsed]  =  parse_Huff (vector, index) 

%  Parses  VECTOR  into  the  JPEG  format.  INDEX  holds  the  position  of  the  last  non-zero 
%  value  of  each  of  the  99  matrices. 

last  =  []; 

level  =  [ ] ; 
run  =  [ ] ; 

for  ndx  =1:99 

if  is inf (index (ndx) ) 

index (ndx)  =  1; 
end 

point  =  sum ( index ( 1 :ndx) ) ; 

seq_to_code  =  vector (point-index (ndx) +1 rpoint) ; 
len  =  length (seq_to_c ode ) ; 
if  (den  ==  1)  &  isinf  (seq_to_code)  ) 

level  =  [ level ; 0 ] ; 

run  =  [ run ; 0 ] ; 
else 

dummy  =  seq_to_code(f ind(seq_to_code) ) ; 

level  =  [level, -dummy]  ,- 

this_run  =  get_run(seq_to_code, length(dummy) ) ; 

run  =  [run; this_run] ; 
end 
end 

level  =  abs(level); 

S  =  zeros (length (level ), 1)  ; 

for  mdx  =  1 : length ( level ) 
if  level (mdx) 

S(mdx)  =  length (dec2bin( level (mdx) )) ; 
else 

S(mdx)  =  0; 
end 
end 

parsed  =  [run,S]; 


function  [ll_q]  =  quantizer_ll ( 1 1 , Q , m_vec_ndx ) 

%  Quantizes  matrix  11  with  quantization  matrix  Q.  M_VEC_NDX  identifies 

%  where  this  quantization  need  be  performed. 

% 

%  see  UNQUANTIZE_LL 

ll_q  =  zeros (792, 8) ; 
offset  =  -8; 
11  =  11*  16; 

for  ndx  =  1:99 

offset  =  offset  +  8; 

if  m_vec_ndx(ndx) 

ll_q(of fset+l:offset+8, : )  =  round ( 11 (of f set+1 : of f set+8,  : )  ./  Q) ; 
end 

end 
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function  [out,rast]  =  raster (mat, in) 

%  Horizontal  raster  scans  the  input  matris  MAT  of  dimensions  INxIN. 

entry  =  in~2; 

offset  =  -entry; 

mat  =  mat . ' ; 

rast  =  reshape (mat, entry*99, 1) ; 

for  ndx  =  1:99 

offset  =  of fset+entry; 

dummy  =  max(f ind (rast (of fset+1 : of fset+entry)  ))  ; 

i  f  i  sempty ( dummy ) 

out (ndx)  =  inf; 
else 

out (ndx)  =  dummy; 
end 

end 

out  =  out . ' ; 


function  [out]  =  remake_3 (f_3, f_fht_ll, f_fht_lh, f_fht_hl, f_fht_hh,m_vec_ndx, in) 

%  Performs  the  inverse  FHT.   F_3  is  the  present  content  of  the  reconstructed  image. 
%  The  next  for  parameters  are  the  subbands  that  update  this  content  as  dictated 
%  by  the  content  of  M_VEC_NDX.   INxIN  is  the  matrix  dimensions. 

half  =  in/2; 

if  (f_fht_ll  ==  0) 

f_fht_ll  =  zeros (99*half, half ) ; 
end 

if  (f_fht_lh  ==  0) 

f_fht_lh  =  zeros (99*half, half)  ; 
end 

if  (f_fht_hl  ==  0) 

f_fht_hl  =  zeros(99*half  ,half)  ; 
end 

if  (f_fht_hh  ==  0) 

f_fht_hh  =  zeros (99*half, half ) ; 
end 

B  =  [111  1;1  1  -1  -1;1  -1  1  -1;1  -1  -1  1]  ; 

x  =  zeros (4,1) 

a  =  zeros (4,1) 

offset  =  -half; 

out  =  zeros  (99*in,  in)  ,- 

f_3  =  f_3  /4; 

for  ndx  =1:99 

offset  =  offset  +  half; 
ost2  =  offset*2; 

if  m_vec_ndx(ndx) 

for  rndx  =  l:half 

for  cndx  =  l:half 

x  =  [f_fht_ll  (rndx+off  set,  cndx)  ,- 
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f_fht_lh(rndx+of fset.cndx) ; 

f_fht_hl (rndx+of fset.cndx) ; 

f_fht_hh(rndx+of fset,cndx) ] ; 
a  =  B  \  x; 
f_3 (rndx*2-l+ost2:rndx*2+ost2,cndx*2-l:cndx*2) 

[a(D  a(2)  ;  a(3)  a(4)]; 


end 
end 
end 
end 
out  =  f_3*4; 


function  [f_out]  =  shape (f_in) 

%  Shapes  the  144x176  f_in  into  a  1584x16  matrix  taken  16  rows  at  a  time 
%  left  to  right,  top  to  bottom,  (raster  scans  the  macroblocks. ) 

f _out  =  [ ] ; 

for  rndx  =  [1  17  33  49  65  81  97  113  129] 

for  cndx  =  [1  17  33  49  65  81  97  113  129  145  161] 

f_out  =  [f_out; [f_in(rndx:rndx+15, cndx:cndx+15) ] ] ; 

end 
end 


function  [f_out]  =  shape_back(f_in) 

%  Shapes  the  1584x16  matrix  back  into  a  144x176  image. 

f _out  =  [ ] ; 
f _row  =  [ ] ; 
offset  =  -176; 

for  i  =  1:9 

offset  =  offset  +  176; 

for  rndx  =  ([1  17  33  49  65  81  97  113  129  145  161]  +  offset) 
f_row  =  [f_row, [f_in(rndx:rndx+15,  : ) ] ] ; 

end 

f_out  =  [f_out;f_row] ; 

f _row  =  [ ] ; 
end 


function  [11]  =  unquantize_ll (ll_q,Q/m_vec_ndx) 

%  Unquantizes  ll_q  with  quantization  matrix  Q.  M_VEC_NDX  idenntifies  where  this 

%  unquantization  need  be  performed. 

% 

%  see  QUANTIZE_LL 

11  =  zeros (792, 8) ; 
offset  =  -8; 
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for  ndx  =1:99 

offset  =  offset  +  8; 

if  m_vec_ndx ( ndx ) 

ll(offset+l:offset+8, :)  =  ll_q(of f set+1 :of fset+8,  : )  .*  Q; 
end 

end 

11  =  11  /  16; 


function  [out, vert]  =  vertical (mat, in) 

%  raster  scans  vertically  top  to  bottom  9  9  matrices  of  size  INxIN  contained  in  MAT. 

offset  =  -in; 

offset2  =  -in^2; 

vert  =  zeros (of f set2*99, 1) ; 

if  (in  ==  8) 

for  ndx  =  1:99 

offset  =  offset  +  8; 
offset2  =  offset2  +  64; 

vert ( of f set2+l :of fset2+64,l)= [mat (off set+1 :of fset+8, 1) ;mat (of f set+1: of fset+8, 2) 
mat (off set+1: of fset+8, 3) ;mat (of f set  +  1: of fset+8, 4) ; mat (of f set+1 : of fset+8,  5) ; 
mat (off set+1: of fset+8, 6) ; mat (of f set+1: of fset+8, 7) ; mat (of f set+1 : of fset+8, 8) ] ; 
dummy  =  max(f ind(vert (of fset2+l:of f set2+64) ) ) ; 

if  isempty (dummy) 

out (ndx)  =  inf; 
else 

out (ndx)  =  dummy; 
end 

end 

else 

for  ndx  =  1:99 

offset  =  offset  +  4; 

offset2  =  offset2  +  16; 

vert(offset2+l:offset2+16, 1) = [mat (of f set+1: of f set+4, 1) ;mat (of f set+1: of f set+4, 2) , 

mat (off set+1: of fset+4,3) ; mat (of f set+1 : of f set+4, 4) ] ; 
dummy  =  max(f indfvert (of f set2+l:of f set2+16) ) ) ; 

if  isempty (dummy) 

out (ndx)  =  inf; 
else 

out (ndx)  =  dummy; 
end 

end 

end 

out  =  out . ' ; 


function  [place, out]  =  zzb(bigmat,M) 

%  zigzag  scans  the  99  matrices  of  size  MxM  contained  in  BIGMAT. 
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vec  =  zeros (M*M, 1) ; 
loop  =  0; 
out  =  [  ]  ; 

for  ndx  =  1:M:M*99 
loop  =  loop  +  1; 
mat  =  bigmat  (ndx:ndx+M-l,  : )  ,- 
vec(l)  =  mat (1, 1) ; 

%  start  scan  at 
x  =  2; 
y  =  1; 
index  =  2 ; 

%  initial  scan  direction 
xstep  =  -1; 
ystep  =  1; 

%  process  each  interior  diagonal 
for  i=2: (2*M-2) 

%  determine  diagonal  length 
if  (i  >  M) 

len  =  2*M  -  i; 
else 

len  =  i; 
end 

%  run  the  diagonal 
for  j  =  l:len 

vec (index)  =  mat(y,x); 

%  move  to  next  point 
x  =  x  +  xstep; 
y  =  y  +  ystep; 
index  =  index  +  1; 
end 

%  set  up  next  pass 
xstep  =  -xstep; 
ystep  =  -ystep; 

if  (x  ==  0) 

if  (y  <=  M) 

x  =  1; 
else 

x  =  x  +  2  ; 

y  =  M; 
end 

elseif  (x  >  M) 
X  =  M; 

y  =  y  +  2; 
end 

if  (y  ==  0) 

if  (x  <=  M) 

y  =  1; 
else 

x  =  M; 

y  =  y  +  2; 
end 

elseif  (y  >  M) 

y  =  M; 

x  =  x  +  2  ; 
end 
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end 

vec(M*M)  =  mat(M,M); 
dummy  =  max  ( f  ind  ( vec )  )  ,- 

if  is empty (dummy) 

place (loop)  =  inf; 

else 

place (loop)  =  dummy; 

end 

out  =  [out; vec] ; 
end 

place  =  place' ; 
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